CN112200152A - Super-resolution method for aligning face images based on residual back-projection neural network - Google Patents

Super-resolution method for aligning face images based on residual back-projection neural network

Info

Publication number
CN112200152A
Authority
CN
China
Prior art keywords
resolution
image
feature map
residual
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011281052.2A
Other languages
Chinese (zh)
Other versions
CN112200152B (en)
Inventor
陆耀
王学博
陈晓珍
王子建
李玮琪
李公平
吴紫薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Media Group
Original Assignee
China Media Group
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Media Group
Publication of CN112200152A
Application granted
Publication of CN112200152B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a super-resolution method for aligning face images based on a residual back-projection neural network, and belongs to the technical field of image processing. The method combines iterative back projection with a deep neural network and magnifies an ultra-low-resolution face image by a factor of 8 in three steps. (1) The ultra-low-resolution face image is input into the neural network, deep features are extracted, and a deconvolution network enlarges the low-resolution feature map to 128 × 128. (2) The 128 × 128 feature map obtained in step (1) is input into the residual back-projection unit of the neural network, and a compensated 128 × 128 high-resolution feature map is obtained through repeated iteration. (3) The high-resolution feature map obtained in step (2) is passed through a convolution layer to generate the final 128 × 128 high-resolution image. The method has clearly separated modules and simple steps, and its super-resolution quality and efficiency meet the requirements of super-resolving real low-resolution face images.

Description

Super-resolution method for aligning face images based on residual back-projection neural network
Technical Field
The invention relates to a super-resolution method for aligning a face image based on a residual back projection neural network, belonging to the technical field of image processing.
Background Art
In computer vision research, face image super-resolution is an important sub-topic: it has many practical applications and also underpins other research topics.
In practical terms, many intelligent applications depend on face image super-resolution, the most important being urban surveillance systems. With rapid economic development, more and more video surveillance cameras are deployed in daily life; they are mainly used to build urban video surveillance systems and play an important role in criminal investigation by public security organs. However, faces captured by real cameras are often difficult to identify directly, for two main reasons. On one hand, the camera is usually far from the face, so the low-resolution target image cannot provide enough detail for recognition. On the other hand, blur from the optical devices, interference from the on-site environment, and transmission and compression noise corrupt the detail of the target, so the feature information needed for face identification is hard to obtain. Enhancing the resolution of real low-resolution face images, and thereby making the target image more identifiable, is therefore a core technical requirement of video surveillance.
From a research perspective, with the rapid development of artificial intelligence, computer vision, one of its core areas, continues to advance. Research on classical vision tasks such as image classification, object detection and face recognition is maturing, but these tasks presuppose high-resolution input images, of which face images are a subset. Super-resolution of face images can therefore be regarded as a foundation for these higher-level vision tasks: with sharper images, image classification becomes more accurate, object detection more precise, and face recognition rates higher. Image quality is improved by a super-resolution method before other vision tasks are performed on real images.
Face super-resolution (FSR) is a domain-specific form of single image super-resolution (SISR). It aims to process a low-resolution (LR) face image algorithmically and raise its resolution so as to obtain a clear high-resolution (HR) face image. Super-resolution solves the inverse problem of recovering a high-resolution image from a low-resolution one. Because high-frequency information is lost during image degradation, the prior information about the image is insufficient, which makes super-resolution an ill-posed problem. Solving ill-posed problems is a long-standing difficulty and focus in many research fields, so super-resolution has high academic research value.
Disclosure of Invention
The invention aims to provide a super-resolution method for aligning face images based on a residual back-projection neural network, which magnifies ultra-low-resolution face images and addresses the problems that existing aligned low-resolution face images have poor visual quality and are difficult to use in existing face analysis systems.
The invention is realized by the following technical scheme.
The super-resolution method for aligning the face image based on the residual back projection neural network comprises the following steps:
step 1, cropping a low-resolution face image to obtain a face image containing the cropped face region;
the size of the face region in the face image is 16 pixels;
step 2, aligning the face image cropped in step 1 so that the eyes lie on a horizontal line, to obtain a highly aligned face image;
step 3, extracting an edge map of the highly aligned face image obtained in step 2 by using a Sobel operator;
step 4, merging the edge map extracted in step 3 with the highly aligned face image along the channel dimension to obtain a channel-merged image;
step 5, extracting deep features of the channel-merged image from step 4 and enlarging the feature map of the low-resolution image to 128 × 128 by iterative back projection to obtain a 128 × 128 feature map, which specifically comprises the following substeps:
step 5.1, extracting 256-dimensional deep features of the channel-merged image from step 4 by using a 3 × 3 convolution layer of the neural network;
step 5.2, mapping the 256-dimensional deep features extracted in step 5.1 to 64-dimensional features by using a 1 × 1 convolution layer;
step 5.3, enlarging the 64-dimensional features to 128 × 128 by using a deconvolution layer with a kernel size of 12 × 12, a stride of 8 and a padding of 2 × 2, to obtain a 128 × 128 feature map;
step 6, down-sampling the 128 × 128 feature map to 16 × 16 by using a convolution layer with a kernel size of 12 × 12, a stride of 8 and a padding of 2 × 2, and taking the difference between it and the 64-dimensional features extracted in step 5.2 to obtain a residual feature map;
step 7, enlarging the residual feature map to 128 × 128 by using a deconvolution layer with a kernel size of 12 × 12, a stride of 8 and a padding of 2 × 2, and adding it to the 128 × 128 feature map obtained in step 5 to obtain a compensated feature map, referred to as the residual iterative back projection;
step 8, extracting an edge map of the residual iterative back projection obtained in step 7 and adding the edge map to the super-resolution reconstruction, specifically: extracting the edge map of the residual iterative back projection by using a 3 × 3 convolution kernel, and supervising the generation of the edge map by using a label image of the edge map;
step 9, merging the edge map generated in step 8 with the residual iterative back projection generated in step 7, generating the final high-resolution face image by using a convolution layer, and performing supervised training by using a high-resolution face label image;
steps 1 to 9 thus complete the super-resolution method for aligning face images based on the residual back-projection neural network.
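The layer parameters in steps 5.3, 6 and 7 can be checked with the standard output-size formulas for strided and transposed convolutions. The following Python sketch is illustrative only: the helper names conv_out and deconv_out are hypothetical, and the formulas assume no dilation and no output padding, neither of which is specified in the method.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a strided convolution along one axis (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding):
    """Output length of a transposed convolution (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# Step 5.3 / step 7: deconvolution with kernel 12, stride 8, padding 2.
up = deconv_out(16, kernel=12, stride=8, padding=2)
# Step 6: convolution with the same kernel, stride and padding.
down = conv_out(up, kernel=12, stride=8, padding=2)
print(up, down)  # prints "128 16"
```

With these parameters the deconvolution maps 16 to 128 and the convolution maps 128 back to 16, which is what allows the down-sampled map to be compared element by element with the 64-dimensional features from step 5.2.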
Advantageous effects
Compared with the existing super-resolution method for aligning the face images, the super-resolution method for aligning the face images based on the residual back projection neural network has the following beneficial effects:
1. the peak signal-to-noise ratio (PSNR) of the high-resolution face image generated by the invention is higher;
2. the high-resolution face image generated by the invention has higher Structural Similarity (SSIM);
3. the high-resolution face image generated by the invention has better visualization effect.
Drawings
FIG. 1 is a flow chart of a super resolution method for aligning face images based on a residual back projection neural network according to an embodiment of the present invention;
fig. 2 is a super-resolution visualization result of a face image.
Detailed Description
The super-resolution method for aligning a face image based on a residual back-projection neural network according to the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Example 1
The example illustrates the specific implementation of the super-resolution method for aligning face images based on the residual back-projection neural network.
When the super-resolution method for aligning face images was implemented, tests were carried out on CelebA, an open-source face image data set containing 200,000 frontal face images. 5,000 randomly sampled face images serve as the validation set, 1,000 face images serve as the test set, and the remaining face images serve as the training set. The training, validation and testing procedures of the neural network are identical except for the data set used. The experimental environment is as follows: the hardware is a Titan X discrete graphics card with 12 GB of video memory, the software system is Ubuntu 14.04, and the Python PyTorch framework is used. Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used as the super-resolution evaluation metrics.
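For reference, PSNR is commonly defined as 10 · log10(MAX² / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the two images. The following is a minimal Python sketch under stated assumptions: 8-bit grayscale images given as nested lists, and a hypothetical helper name psnr. The evaluation code actually used in the experiments is not given in this document.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (nested lists)."""
    squared_errors = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


reference = [[100, 100], [100, 100]]
estimate = [[110, 90], [110, 90]]   # every pixel is off by 10, so MSE = 100
print(psnr(reference, estimate))
```

A higher PSNR means a smaller mean squared error relative to the ground-truth high-resolution image.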
The super-resolution method for aligning the face images, disclosed by the invention, comprises the specific implementation steps as shown in figure 1.
As can be seen from fig. 1, the super-resolution method includes the following steps:
step 1) uniformly preprocessing the face image data set and cropping a 128 × 128 face region image from each face image;
step 2) aligning the face image cropped in step 1) so that the eyes lie on a horizontal line, to obtain a highly aligned face image;
step 3) extracting an edge map of the highly aligned face image obtained in step 2) by using a Sobel operator;
step 4) down-sampling both the edge map extracted in step 3) and the highly aligned face image to 16 × 16 by bicubic interpolation;
step 5) merging the down-sampled edge map from step 4) with the down-sampled highly aligned face image along the channel dimension;
step 6) extracting deep features of the channel-merged image from step 5) and enlarging the feature map of the low-resolution image to 128 × 128 by iterative back projection. Specifically, 256-dimensional deep features of the input image are extracted with a 3 × 3 convolution layer of the neural network, and the 256-dimensional features are mapped to 64-dimensional features with a 1 × 1 convolution layer. The features are then enlarged to 128 × 128 with a deconvolution layer having a kernel size of 12 × 12, a stride of 8 and a padding of 2 × 2;
step 7) down-sampling the 128 × 128 feature map back to 16 × 16 with a convolution layer having the same parameters, and subtracting it from the input feature map to obtain a residual feature map;
step 8) enlarging the residual feature map to 128 × 128 with the deconvolution layer and adding it to the 128 × 128 feature map obtained in the previous step to obtain a compensated feature map. This is called residual iterative back projection. The same iteration is carried out 7 times in total;
step 9) extracting an edge map of the high-resolution feature map obtained in step 8) and adding it to the super-resolution reconstruction process. Specifically, a 3 × 3 convolution kernel is used to extract the edge map of the 128 × 128 feature map, and a label image of the edge map is used to supervise its generation;
step 10) merging the edge map generated in step 9) with the feature map generated in step 8), and generating the final high-resolution face image with a convolution layer.
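Steps 6) to 8) can be sketched in PyTorch, the framework named in the experimental environment. This is an illustrative reading, not the patented implementation, and several details are assumptions because the text does not fix them: a 4-channel input (RGB plus one edge channel), a padding of 1 in the 3 × 3 convolution so that the 16 × 16 size is preserved, no activation functions, a separate pair of layers for each of the 7 iterations, and a residual computed as the input features minus the down-sampled map. The class name ResidualBackProjectionSketch is hypothetical.

```python
import torch
import torch.nn as nn


class ResidualBackProjectionSketch(nn.Module):
    """Illustrative sketch of steps 6) to 8); see the stated assumptions."""

    def __init__(self, in_channels=4, iterations=7):
        super().__init__()
        # Step 6: 3x3 conv to 256 features, then 1x1 conv down to 64 features.
        self.extract = nn.Conv2d(in_channels, 256, kernel_size=3, padding=1)
        self.reduce = nn.Conv2d(256, 64, kernel_size=1)
        # Step 6: deconvolution (kernel 12, stride 8, padding 2), 16x16 -> 128x128.
        self.up = nn.ConvTranspose2d(64, 64, kernel_size=12, stride=8, padding=2)
        # Steps 7 and 8: one down/up layer pair per iteration (assumed not shared).
        self.down = nn.ModuleList(
            [nn.Conv2d(64, 64, kernel_size=12, stride=8, padding=2)
             for _ in range(iterations)]
        )
        self.up_residual = nn.ModuleList(
            [nn.ConvTranspose2d(64, 64, kernel_size=12, stride=8, padding=2)
             for _ in range(iterations)]
        )

    def forward(self, x):
        low = self.reduce(self.extract(x))   # (N, 64, 16, 16)
        high = self.up(low)                  # (N, 64, 128, 128)
        for down, up_residual in zip(self.down, self.up_residual):
            residual = low - down(high)          # step 7: residual at 16x16
            high = high + up_residual(residual)  # step 8: compensated map
        return high


model = ResidualBackProjectionSketch()
with torch.no_grad():
    features = model(torch.zeros(1, 4, 16, 16))
print(tuple(features.shape))  # prints "(1, 64, 128, 128)"
```

The edge branch of step 9) and the final convolution of step 10) are omitted; they would take the 64-channel 128 × 128 map returned here as input.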
The specific super-resolution results are as follows:
Tests were performed on a test set of 1,000 low-resolution face images, and the results were compared with the current best super-resolution methods LapSRN, DBPN, URDGN and CBN, as shown in Table 1 below.
TABLE 1 super-resolution test results of face images
Methods    Bicubic  LapSRN   DBPN     URDGN    CBN      Ours
PSNR (dB)  22.2025  23.9884  24.0100  23.6326  23.8004  24.2391
SSIM       0.5653   0.6810   0.6812   0.6710   0.6723   0.6921
As can be seen from the quantitative indexes in Table 1, the super-resolution method for aligning face images based on the residual back-projection neural network outperforms the current best super-resolution methods in both peak signal-to-noise ratio (PSNR) and structural similarity (SSIM): its PSNR is 2.03 dB higher than that of the traditional bicubic method and 0.22 dB higher than that of DBPN, the best existing method, while its SSIM is 0.13 higher than that of the bicubic method and 0.011 higher than that of DBPN.
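The gains quoted above follow directly from the entries in Table 1. A short Python check, using only values copied from the table (the variable names are illustrative):

```python
# PSNR (dB) and SSIM values copied from Table 1.
psnr_db = {"Bicubic": 22.2025, "DBPN": 24.0100, "Ours": 24.2391}
ssim_values = {"Bicubic": 0.5653, "DBPN": 0.6812, "Ours": 0.6921}

for name in ("Bicubic", "DBPN"):
    psnr_gain = psnr_db["Ours"] - psnr_db[name]
    ssim_gain = ssim_values["Ours"] - ssim_values[name]
    print(f"vs {name}: PSNR +{psnr_gain:.4f} dB, SSIM +{ssim_gain:.4f}")
```

This prints a PSNR gain of 2.0366 dB and an SSIM gain of 0.1268 over bicubic interpolation, and 0.2291 dB and 0.0109 over DBPN; the PSNR figures in the text are these differences truncated to two decimal places.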
In addition to the quantitative evaluation, a qualitative visual comparison was made with the current best super-resolution methods LapSRN, DBPN, URDGN and CBN. As shown in FIG. 2 (super-resolution visualization of face images), the high-resolution face images generated by the super-resolution method for aligning face images based on the residual back-projection neural network are structurally more consistent with the original images and contain richer detail.
While the foregoing is directed to the preferred embodiment of the present invention, it is not intended that the invention be limited to the embodiment and the drawings disclosed herein. Equivalents and modifications may be made without departing from the spirit of the disclosure, which is to be considered as within the scope of the invention.

Claims (5)

1. The super-resolution method for aligning the face image based on the residual back projection neural network is characterized in that: the method comprises the following steps:
step 1, cropping a low-resolution face image to obtain a face image containing the cropped face region;
step 2, aligning the face image cropped in step 1 so that the eyes lie on a horizontal line, to obtain a highly aligned face image;
step 3, extracting an edge map of the highly aligned face image obtained in step 2 by using a Sobel operator;
step 4, merging the edge map extracted in step 3 with the highly aligned face image along the channel dimension to obtain a channel-merged image;
step 5, extracting deep features of the channel-merged image from step 4 and enlarging the feature map of the low-resolution image to 128 × 128 by iterative back projection to obtain a 128 × 128 feature map;
step 6, down-sampling the 128 × 128 feature map to 16 × 16 by using a convolution layer with a kernel size of 12 × 12, a stride of 8 and a padding of 2 × 2, to obtain a residual feature map;
step 7, enlarging the residual feature map to 128 × 128 by using a deconvolution layer with a kernel size of 12 × 12, a stride of 8 and a padding of 2 × 2, and adding it to the 128 × 128 feature map obtained in step 5 to obtain a compensated feature map, referred to as the residual iterative back projection;
step 8, extracting an edge map of the residual iterative back projection obtained in step 7 and adding the edge map to the super-resolution reconstruction to generate an edge map;
step 9, merging the edge map generated in step 8 with the residual iterative back projection generated in step 7, generating the final high-resolution face image by using a convolution layer, and performing supervised training by using a high-resolution face label image.
2. The super-resolution method for aligning face images based on residual back-projection neural network as claimed in claim 1, wherein: in step 1, the size of the face area in the face image is 16 pixels.
3. The super-resolution method for aligning face images based on residual back-projection neural network as claimed in claim 1, wherein: step 5, specifically comprising the following substeps:
step 5.1, extracting 256-dimensional deep features of the channel-merged image from step 4 by using a 3 × 3 convolution layer of the neural network;
step 5.2, mapping the 256-dimensional deep features extracted in step 5.1 to 64-dimensional features by using a 1 × 1 convolution layer;
step 5.3, enlarging the 64-dimensional features to 128 × 128 by using a deconvolution layer with a kernel size of 12 × 12, a stride of 8 and a padding of 2 × 2, to obtain a 128 × 128 feature map.
4. The super-resolution method for aligning face images based on residual back-projection neural network as claimed in claim 1, wherein: the residual feature map in step 6 is obtained by down-sampling the 128 × 128 feature map to 16 × 16 with a convolution layer having a kernel size of 12 × 12, a stride of 8 and a padding of 2 × 2, and taking the difference between it and the 64-dimensional features extracted in step 5.2.
5. The super-resolution method for aligning face images based on residual back-projection neural network as claimed in claim 1, wherein: step 8, specifically: extracting residual iterative back-projected edge maps using a 3 x 3 convolution kernel, and using labeled images of the edge maps to supervise the generation of the edge maps.
CN202011281052.2A 2019-12-06 2020-11-16 Super-resolution method for aligning face images based on residual back projection neural network Active CN112200152B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911240207.5A CN110991355A (en) 2019-12-06 2019-12-06 Super-resolution method for aligning face images based on residual back-projection neural network
CN2019112402075 2019-12-06

Publications (2)

Publication Number Publication Date
CN112200152A true CN112200152A (en) 2021-01-08
CN112200152B CN112200152B (en) 2024-04-26

Family

ID=70090657

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201911240207.5A Withdrawn CN110991355A (en) 2019-12-06 2019-12-06 Super-resolution method for aligning face images based on residual back-projection neural network
CN202011281052.2A Active CN112200152B (en) 2019-12-06 2020-11-16 Super-resolution method for aligning face images based on residual back projection neural network

Country Status (1)

Country Link
CN (2) CN110991355A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018099405A1 (en) * 2016-11-30 2018-06-07 京东方科技集团股份有限公司 Human face resolution re-establishing method and re-establishing system, and readable medium
CN106600538A (en) * 2016-12-15 2017-04-26 武汉工程大学 Human face super-resolution algorithm based on regional depth convolution neural network
CN108447020A (en) * 2018-03-12 2018-08-24 南京信息工程大学 A kind of face super-resolution reconstruction method based on profound convolutional neural networks
CN109325915A (en) * 2018-09-11 2019-02-12 合肥工业大学 A kind of super resolution ratio reconstruction method for low resolution monitor video
CN109671023A (en) * 2019-01-24 2019-04-23 江苏大学 A kind of secondary method for reconstructing of face image super-resolution
CN110276721A (en) * 2019-04-28 2019-09-24 天津大学 Image super-resolution rebuilding method based on cascade residual error convolutional neural networks

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
XUEBO WANG 等: "Asymmetric Pyramid Based Super Resolution from Very Low Resolution Face Image", 《PRCV (2) 2019》, pages 694 - 702 *
XUEBO WANG 等: "RBPNET: An Asymptotic Residual Back-Projection Network for Super Resolution of Very Low Resolution Face Image", 《ICONIP (2) 2019》, pages 1 - 12 *
ZHI-SONG LIU 等: "Joint Back Projection and Residual Networks for Efficient Image Super-Resolution", 《2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC)》, pages 1 - 7 *
卢涛 等: "基于边缘增强生成对抗网络的人脸超分辨率重建", 《华中科技大学学报(自然科学版)》, vol. 48, no. 1, pages 1 - 6 *
安耀祖 等: "一种自适应正则化的图像超分辨率算法", 《自动化学报》, vol. 38, no. 4, pages 601 - 608 *
李公平 等: "基于模糊核估计的图像盲超分辨率神经网络", 《自动化学报》, pages 1 - 14 *
王一宁 等: "基于残差神经网络的图像超分辨率改进算法", 《计算机应用》, vol. 38, no. 1, pages 1 - 9 *

Also Published As

Publication number Publication date
CN112200152B (en) 2024-04-26
CN110991355A (en) 2020-04-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant