CN116311462A - Facial image restoration and recognition method combining context information and VGG19 - Google Patents


Info

Publication number
CN116311462A
Authority
CN
China
Prior art keywords
network
vgg19
face image
restoration
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310306314.3A
Other languages
Chinese (zh)
Inventor
陈波
陈圩钦
邓媛丹
曾俊涛
朱舜文
王庆先
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202310306314.3A priority Critical patent/CN116311462A/en
Publication of CN116311462A publication Critical patent/CN116311462A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention discloses a face image restoration and recognition method combining context information and VGG19, and relates to the technical field of image processing. It aims to solve the technical problems of information loss and inefficient restoration in existing methods for repairing incomplete faces. The invention comprises the following steps: first, a generative adversarial network for face restoration is constructed and trained; the occluded face image dataset 1 is input into the trained generative adversarial network to obtain a repaired face image dataset 2; the unoccluded face image dataset is then input into a VGG19 network, which is trained for classification and recognition; finally, the repaired face image dataset 2 is input into the trained VGG19 recognition model obtained in step 3 to obtain the final recognition result for the repaired images. The invention replaces part of the standard convolutions with dilated convolutions, which helps enlarge the receptive field of the restoration model, and combines the restoration module with the classification and recognition module, which improves the recognition rate for incomplete face images.

Description

Facial image restoration and recognition method combining context information and VGG19
Technical Field
The invention relates to the technical field of image processing, and in particular to a face image restoration and recognition method combining context information and VGG19.
Background
An image is a very important and frequently used information carrier, and its completeness determines how much information we can obtain from it. Image restoration, an important research topic in computer vision, mainly aims at filling in the missing parts of an image, and is widely applied in scenarios such as face recognition, old-photo restoration, and cultural-relic restoration. However, owing to the complexity of the acquisition environment and the dynamic variability of the acquired subjects, captured face information is often incomplete, which seriously degrades the accuracy of face recognition and limits its range of application.
Methods for recognizing incomplete face images fall into two categories: those that first repair the incomplete image with a restoration method and then classify it, and those that recognize the incomplete image directly. Both categories leave room for further development, and finding a simple, effective method of repairing incomplete faces is of great interest to current researchers.
In recent years, generative adversarial networks (Generative Adversarial Networks, GAN) have shown excellent results in image generation, and great progress has been made in image restoration, but several problems remain. First, existing models train too freely, easily lose direction, and are unstable. Second, real environments are variable and complex, and repairing large missing regions is difficult because textural continuity is lacking. Third, the training balance between generator and discriminator is hard to maintain, and gradient explosion readily causes training to fail. For classification and recognition, VGG19 has a regular network structure and excellent classification performance, and is one of the commonly used classification and recognition networks.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a face image restoration and recognition method combining context information and VGG19, aiming at the technical problems of information loss and inefficient restoration in existing methods for repairing incomplete faces.
A face image restoration and recognition method combining context information and VGG19 comprises the following steps:
step 1: acquiring a face image dataset 0, applying random occlusions to it to obtain an occluded face image dataset 1, and dividing dataset 1 into training samples and test samples in proportion;
step 2: constructing a generative adversarial network for face restoration and training it with the training samples obtained in step 1;
step 3: inputting the test samples obtained in step 1 into the generative adversarial network trained in step 2, the repaired images output by the network forming face image dataset 2;
step 4: inputting the face image dataset 0 obtained in step 1 into a VGG19 network and training the VGG19 network for classification and recognition to obtain a VGG19 recognition model;
step 5: inputting the face image dataset 2 obtained in step 3 into the VGG19 recognition model obtained in step 4 to obtain the final recognition result.
Preferably, the generative adversarial network in step 2 comprises a generation network and a discrimination network, the generation network employing an improved convolutional neural network that embeds dilated (hole) convolution layers while also using conventional convolutions with batch normalization layers.
Preferably, the improved convolutional neural network employs a Leaky ReLU as the activation function.
Preferably, the discrimination network adopts a dual-discriminator structure comprising a global context discriminator network and a local context discriminator network; an extra Wasserstein loss is added during training of the discrimination network to aid training, and the Wasserstein distance is calculated as:
$$W(p,q)=\inf_{\gamma\in\Pi(p,q)}\ \mathbb{E}_{(x,y)\sim\gamma}\big[\,\lVert x-y\rVert\,\big]$$
wherein Π(p, q) denotes the set of all possible joint distributions combining the distribution p and the distribution q, γ is a joint distribution drawn from the set Π(p, q), (x, y) is a pair of random variables sampled from γ, and W(p, q) denotes the Wasserstein distance.
Preferably, the VGG19 network in step 4 comprises 16 convolutional layers and 3 fully connected layers, the convolutional layers being arranged sequentially, each followed by a rectified linear unit (ReLU) activation function, with max pooling layers interleaved; the ReLU activation function is as follows:
$$f(x)=\max(0,x)$$
where x is the input variable.
Preferably, during training the VGG19 introduces a scaling factor for each layer's channels and applies sparse regularization to automatically identify unimportant channels; channels with smaller scaling-factor values are pruned, a compact model is obtained after pruning, and the compact model is then fine-tuned.
The invention has the beneficial effects that:
(1) The restoration model is lightweight; replacing part of the standard convolutions with dilated convolutions enlarges the receptive field of the restoration model;
(2) The restoration module and the classification and recognition module are combined, which improves the recognition rate for incomplete face images;
(3) The discrimination network of the generative adversarial network has a dual-discriminator structure, which ensures both the global and local consistency of the generated image.
Drawings
Fig. 1 is a flowchart of the face image restoration and recognition method combining context information and VGG19 according to embodiment 1.
Fig. 2 shows the training process of the generative adversarial network according to embodiment 1.
Fig. 3 is a schematic diagram of the structure of the generation network according to embodiment 1.
Fig. 4 is a schematic diagram of the structure of the discrimination network according to embodiment 1.
Fig. 5 is a schematic diagram of the structure of VGG19 according to embodiment 1.
Fig. 6 shows image completion results obtained by the method of embodiment 1 using randomly generated masks on the CelebA dataset.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings; the described embodiments are plainly only some, not all, embodiments of the present application. The following detailed description, as provided in the accompanying drawings, is therefore not intended to limit the scope of the application as claimed but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the present application.
Specific embodiments of the present invention are described in detail below with reference to Figs. 1-6.
example 1
A face image restoration and identification method combining context information and VGG19 comprises the following steps:
step 1: adding irregular shielding to the face area in the complete face image data set 0 to generate a shielding face image data set 1; 80% of images are selected from the occlusion face image dataset 1 to serve as training samples, and the remaining 20% are used as test samples.
Step 2: train a generative adversarial network for face restoration with the training samples. The network comprises a generation network and a discrimination network: a training sample is input into the generation network to obtain a generated sample, which is then input into the discrimination network for real/fake judgment. If the sample is judged real, the result is output; if judged fake, the generation network is adjusted and a sample is regenerated until the output is judged real. The training flow of the generative adversarial network is shown in Fig. 2.
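The generate/judge/adjust loop described above can be sketched as plain control flow; the toy scalar generator, threshold discriminator, and target value below are hypothetical stand-ins for the actual networks, chosen only to make the loop runnable:

```python
def train_until_accepted(generate, discriminate, adjust, max_steps=1000):
    """Sketch of the loop in Fig. 2: generate a sample, ask the
    discriminator whether it looks real, and adjust the generator
    until a sample is accepted (or the step budget runs out)."""
    for step in range(max_steps):
        sample = generate()
        if discriminate(sample):
            return sample, step
        adjust(sample)
    raise RuntimeError("generator never fooled the discriminator")

# Toy demonstration: the "generator" emits a scalar drifting toward the
# "real" value 1.0; the "discriminator" accepts anything within 0.05.
state = {"value": 0.0}
gen = lambda: state["value"]
disc = lambda s: abs(s - 1.0) < 0.05
adj = lambda s: state.__setitem__("value", s + 0.1)
sample, steps = train_until_accepted(gen, disc, adj)
print(round(sample, 1), steps)  # 1.0 10
```

A real implementation would instead update generator weights by gradient descent on the combined adversarial and Wasserstein losses.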
the generation network employs a modified convolutional neural network that uses not only a conventional convolutional with a batch normalization layer, but also embeds some hole convolutional layers, as shown in fig. 3.
By adding dilated convolutions to the convolutional layers, kernels of the same size obtain a larger receptive field for the same number of parameters and the same computational cost, avoiding loss of image resolution and information. This is very important for the image restoration task, because background context matters for realism. Meanwhile, we use the Leaky ReLU as the activation function so that, during backpropagation, gradients can also be computed for inputs less than zero (rather than being zero, as with ReLU); the Leaky ReLU thus keeps the whole network trainable and further improves computational efficiency. The specific expression of the Leaky ReLU is:
$$f(x)=\begin{cases}x, & x>0\\ ax, & x\le 0\end{cases}$$
where x is the input variable and a is a small positive constant.
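A small sketch of the two ideas in this passage: the enlarged receptive field of a dilated kernel and the Leaky ReLU. The formula k + (k-1)(d-1) for the effective kernel width is the standard one for dilation d and is assumed here rather than stated in the patent:

```python
import numpy as np

def leaky_relu(x, a=0.01):
    """Leaky ReLU: passes x through when positive, scales by the small
    constant a when negative, so gradients never vanish entirely."""
    return np.where(x > 0, x, a * x)

def effective_kernel_size(k, dilation):
    """Receptive-field width of a k x k kernel with the given dilation:
    the taps spread out, covering k + (k-1)*(dilation-1) pixels while
    the parameter count stays at k*k."""
    return k + (k - 1) * (dilation - 1)

print(effective_kernel_size(3, 1))  # 3 (standard convolution)
print(effective_kernel_size(3, 2))  # 5
print(leaky_relu(np.array([-2.0, 3.0]), a=0.2))
```

In a deep-learning framework the same effect comes from the convolution layer's dilation parameter; stacking a few dilated layers grows the receptive field quickly without downsampling.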
The discrimination network adopts a dual-discriminator structure comprising a global discriminator and a local discriminator, from which the discrimination loss is obtained. The structure of the discrimination network is shown in Fig. 4.
The global context discriminator is combined with the local context discriminator network to determine whether an image is real or generated by the restoration network. They differ in their inputs: the global discriminator receives the entire image, while the local discriminator receives only the small patch of the image that needs to be completed. This ensures consistency of the image's overall structure and semantics while also attending to consistency of the completed details.
In addition, an extra Wasserstein loss is added to the training process to aid network training. The usual objective of a generative adversarial network, built from KL (Kullback-Leibler) divergences, amounts to minimizing the JS (Jensen-Shannon) divergence between the generated and real image distributions, which leads to unstable model gradients when the two distributions barely overlap. The Wasserstein distance is more sensitive to changes between distributions and provides a meaningful gradient everywhere, which helps our restoration network train better. The Wasserstein distance is calculated as:
$$W(p,q)=\inf_{\gamma\in\Pi(p,q)}\ \mathbb{E}_{(x,y)\sim\gamma}\big[\,\lVert x-y\rVert\,\big]$$
wherein Π(p, q) denotes the set of all possible joint distributions combining the distribution p and the distribution q, γ is a joint distribution drawn from the set Π(p, q), (x, y) is a pair of random variables sampled from γ, and W(p, q) denotes the Wasserstein distance.
Step 3: input the test samples into the generative adversarial network trained in step 2 to obtain the repaired face image dataset 2.
Step 4: use the face image dataset 0 as training samples, input them into the VGG19 network, and train the VGG19 network for classification and recognition.
The VGG19 architecture consists of 19 weight layers: 16 convolutional layers and 3 fully connected layers. The convolutional layers are arranged sequentially, each followed by a rectified linear unit (ReLU) activation function, with a max pooling layer after each convolutional block. The network structure of VGG19 is shown in Fig. 5, and the ReLU activation function is as follows:
$$f(x)=\max(0,x)$$
where x is the input variable.
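The 16 convolutional and 3 fully connected layers can be checked against the standard VGG19 configuration. The channel numbers below are the standard ones; the final fully connected width (here the ImageNet default of 1000) would in practice equal the number of face identities to recognize:

```python
# Standard VGG19 configuration: numbers are conv output channels,
# "M" marks a max pooling layer between convolutional blocks.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]
FC_LAYERS = [4096, 4096, 1000]  # last width = number of output classes

n_conv = sum(1 for v in VGG19_CFG if v != "M")
print(n_conv, len(FC_LAYERS))  # 16 3
```

Counting the entries confirms the 16 + 3 = 19 weight layers the description names (pooling layers carry no weights and are not counted).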
During training, a scaling factor is introduced for each layer's channels, and sparse regularization is applied to automatically identify unimportant channels. Channels with smaller scaling-factor values are pruned. After pruning, a compact model is obtained, which is then fine-tuned to reach an accuracy comparable to (or even higher than) that of a normally trained full network.
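A minimal sketch of the pruning step, in the style of network slimming; the keep ratio and the rank-based thresholding rule are illustrative assumptions, since the patent does not give a concrete threshold:

```python
import numpy as np

def prune_channels(scaling_factors, keep_ratio=0.7):
    """Keep the channels whose (batch-norm) scaling factors are largest
    in magnitude and drop the rest; returns sorted indices of the
    surviving channels. keep_ratio is an illustrative choice."""
    gammas = np.asarray(scaling_factors, dtype=float)
    n_keep = max(1, int(round(len(gammas) * keep_ratio)))
    order = np.argsort(-np.abs(gammas))   # most important channels first
    return np.sort(order[:n_keep])

# Channels 1 and 3 have near-zero factors after sparse regularization,
# so they are the ones pruned away.
gammas = [0.9, 0.01, 0.5, 0.002, 0.7]
print(prune_channels(gammas, keep_ratio=0.6))  # [0 2 4]
```

After pruning, the corresponding filters are physically removed from the convolution weights and the compact model is fine-tuned, as the paragraph above describes.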
Step 5: use the repaired face image dataset 2 obtained in step 3 as the test set, and input it into the trained VGG19 recognition model obtained in step 4 to obtain the final recognition result for the repaired images.
The pruned VGG model is a lightweight and efficient convolutional neural network; compared with the unpruned VGG19 model, it has fewer parameters and several times fewer computational operations, without degrading model performance.
We performed training and testing for image restoration and recognition on the CelebA dataset, all 202,599 images of which (covering 10,177 celebrities) are labeled; the restoration results are shown in Fig. 6. Furthermore, to better verify the performance of the proposed recognition model, we compared the new model against the conventional VGG19 on different data, using accuracy and Top-2 as metrics. Since the area of the missing region significantly affects the restoration quality, we compared performance under different missing-area ratios, namely 10%-15% and 20%-30%, to fully measure the capability of the method. The results are shown in Table 1 below:
TABLE 1
[Table 1 appears as an image in the original publication and is not reproduced here.]
Example 2
Building on embodiment 1, this embodiment inputs an occluded face image dataset 3 into the trained generative adversarial network to obtain the corresponding repaired images, and then inputs these into the VGG19 recognition model to obtain the final recognition result for dataset 3, thereby completing image restoration and recognition for the occluded face image dataset 3.
The foregoing examples merely represent specific embodiments of the present application; they are described in detail but are not thereby to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the technical solution of the present application, and such variations fall within its protection scope.

Claims (6)

1. A face image restoration and recognition method combining context information and VGG19, characterized by comprising the following steps:
step 1: acquiring a face image dataset 0, applying random occlusions to it to obtain an occluded face image dataset 1, and dividing dataset 1 into training samples and test samples in proportion;
step 2: constructing a generative adversarial network for face restoration and training it with the training samples obtained in step 1;
step 3: inputting the test samples obtained in step 1 into the generative adversarial network trained in step 2, the repaired images output by the network forming face image dataset 2;
step 4: inputting the face image dataset 0 obtained in step 1 into a VGG19 network and training the VGG19 network for classification and recognition to obtain a VGG19 recognition model;
step 5: inputting the face image dataset 2 obtained in step 3 into the VGG19 recognition model obtained in step 4 to obtain the final recognition result.
2. The face image restoration and recognition method combining context information and VGG19 according to claim 1, characterized in that the generative adversarial network in step 2 comprises a generation network and a discrimination network, the generation network employing an improved convolutional neural network that embeds dilated (hole) convolution layers while also using conventional convolutions with batch normalization layers.
3. The face image restoration and recognition method combining context information and VGG19 according to claim 2, characterized in that the improved convolutional neural network employs a Leaky ReLU as the activation function.
4. The face image restoration and recognition method combining context information and VGG19 according to claim 2, characterized in that the discrimination network adopts a dual-discriminator structure comprising a global context discriminator network and a local context discriminator network, an extra Wasserstein loss being added during training of the discrimination network to aid training, the Wasserstein distance being calculated as:
$$W(p,q)=\inf_{\gamma\in\Pi(p,q)}\ \mathbb{E}_{(x,y)\sim\gamma}\big[\,\lVert x-y\rVert\,\big]$$
wherein Π(p, q) denotes the set of all possible joint distributions combining the distribution p and the distribution q, γ is a joint distribution drawn from the set Π(p, q), (x, y) is a pair of random variables sampled from γ, and W(p, q) denotes the Wasserstein distance.
5. The face image restoration and recognition method combining context information and VGG19 according to claim 1, characterized in that the VGG19 network in step 4 comprises 16 convolutional layers and 3 fully connected layers, the convolutional layers being arranged sequentially, each followed by a rectified linear unit (ReLU) activation function, with max pooling layers interleaved, the ReLU activation function being as follows:
$$f(x)=\max(0,x)$$
where x is the input variable.
6. The face image restoration and recognition method combining context information and VGG19 according to any one of claims 1-5, characterized in that during training the VGG19 introduces a scaling factor for each layer's channels, applies sparse regularization to automatically identify unimportant channels, prunes channels with smaller scaling-factor values to obtain a compact model after pruning, and then fine-tunes the compact model.
CN202310306314.3A 2023-03-27 2023-03-27 Facial image restoration and recognition method combining context information and VGG19 Pending CN116311462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310306314.3A CN116311462A (en) 2023-03-27 2023-03-27 Facial image restoration and recognition method combining context information and VGG19

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310306314.3A CN116311462A (en) 2023-03-27 2023-03-27 Facial image restoration and recognition method combining context information and VGG19

Publications (1)

Publication Number Publication Date
CN116311462A true CN116311462A (en) 2023-06-23

Family

ID=86783258

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310306314.3A Pending CN116311462A (en) 2023-03-27 2023-03-27 Facial image restoration and recognition method combining context information and VGG19

Country Status (1)

Country Link
CN (1) CN116311462A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114331904A (en) * 2021-12-31 2022-04-12 电子科技大学 Face shielding identification method
CN114331904B (en) * 2021-12-31 2023-08-08 电子科技大学 Face shielding recognition method
CN116895091A (en) * 2023-07-24 2023-10-17 山东睿芯半导体科技有限公司 Facial recognition method and device for incomplete image, chip and terminal

Similar Documents

Publication Publication Date Title
CN116311462A (en) Facial image restoration and recognition method combining context information and VGG19
CN110111345B (en) Attention network-based 3D point cloud segmentation method
CN109858613B (en) Compression method and system of deep neural network and terminal equipment
CN109801232A (en) A kind of single image to the fog method based on deep learning
CN111932511B (en) Electronic component quality detection method and system based on deep learning
CN113608916B (en) Fault diagnosis method and device, electronic equipment and storage medium
CN115601661A (en) Building change detection method for urban dynamic monitoring
CN111368707B (en) Face detection method, system, device and medium based on feature pyramid and dense block
CN112329857A (en) Image classification method based on improved residual error network
CN116128820A (en) Pin state identification method based on improved YOLO model
CN113628297A (en) COVID-19 deep learning diagnosis system based on attention mechanism and transfer learning
CN114519819B (en) Remote sensing image target detection method based on global context awareness
Qu et al. UMLE: unsupervised multi-discriminator network for low light enhancement
CN114549959A (en) Infrared dim target real-time detection method and system based on target detection model
CN108550114A (en) A kind of human face super-resolution processing method and system of multiscale space constraint
Sahasrabudhe et al. Structured spatial domain image and data comparison metrics
Zhao et al. Image tampering detection via semantic segmentation network
CN116703885A (en) Swin transducer-based surface defect detection method and system
CN106570928A (en) Image-based re-lighting method
CN116524356A (en) Ore image small sample target detection method and system
Chen et al. A robust object segmentation network for underwater scenes
CN115239655A (en) Thyroid ultrasonic image tumor segmentation and classification method and device
CN113344110B (en) Fuzzy image classification method based on super-resolution reconstruction
CN113724245A (en) Method and system for detecting welding spot defects of PCB plug-in and storage medium thereof
CN113034432A (en) Product defect detection method, system, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination