WO2018099405A1 - Face resolution reconstruction method, reconstruction system and readable medium (人脸分辨率重建方法、重建系统和可读介质) - Google Patents


Info

Publication number
WO2018099405A1
WO2018099405A1 (PCT application PCT/CN2017/113642)
Authority
WO
WIPO (PCT)
Prior art keywords
image
gradient
resolution
standard
feature
Application number
PCT/CN2017/113642
Other languages
English (en)
French (fr)
Inventor
张丽杰 (Zhang Lijie)
Original Assignee
京东方科技集团股份有限公司
Application filed by 京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Priority to US application 16/062,339 (granted as US10825142B2)
Priority to EP application 17876582.2 (published as EP3550509A4)
Publication of WO2018099405A1


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 — Geometric image transformations in the plane of the image
    • G06T3/40 — Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 — Scaling based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T3/4076 — Super-resolution using the original low-resolution images to iteratively correct the high-resolution images
    • G06T5/00 — Image enhancement or restoration
    • G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/20 — Special algorithmic details
    • G06T2207/20212 — Image combination
    • G06T2207/20221 — Image fusion; Image merging
    • G06T2207/30 — Subject of image; Context of image processing
    • G06T2207/30196 — Human being; Person
    • G06T2207/30201 — Face

Definitions

  • The present invention relates to image processing, and more particularly to a machine-learning-based face resolution reconstruction method, a reconstruction system, and a readable medium.
  • The face image super-resolution reconstruction technology is mainly suited to enlarging photos stored in existing IC cards, making them convenient to view, print, and so on.
  • Super-resolution reconstruction techniques are especially useful in situations where the cost of replacing existing (storage and acquisition) equipment is high and the feasibility of re-acquisition is low.
  • the face image obtained from the security system and the monitoring device can be super-resolution processed for easy identification. Due to limitations in hardware technology, cost, etc., clear high-resolution images may not be acquired in the surveillance field. Using super-resolution reconstruction technology can reduce the dependence on hardware devices and improve system availability.
  • Embodiments of the present invention provide a face resolution reconstruction method, including: acquiring an input image having a first resolution; determining image gradient information of the input image based on the input image and a standard gradient image library having a second resolution; fusing the image gradient information and superimposing the fused gradient information onto the input image; and generating an output image having a third resolution, wherein the second resolution and the third resolution are both higher than the first resolution.
  • the image gradient information comprises an edge gradient and a second resolution feature gradient.
  • the edge gradient is a face contour gradient
  • Determining the edge gradient of the input image includes: upsampling the input image while retaining its directional characteristics to obtain a first image, and calculating the gradient information of the first image as the edge gradient information of the input image.
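The edge-gradient step above can be sketched in Python. Plain bilinear upsampling is used here only as a stand-in for the patent's direction-preserving upsampling (whose exact formula is not reproduced on this page), and `np.gradient` supplies the gradient of the first image; all function names are illustrative.

```python
import numpy as np

def upsample_bilinear(img, scale):
    """Plain bilinear upsampling — a stand-in for the patent's
    direction-preserving upsampling function."""
    h, w = img.shape
    H, W = h * scale, w * scale
    ys = np.linspace(0, h - 1, H)
    xs = np.linspace(0, w - 1, W)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def edge_gradient(low_res, scale=2):
    """Upsample the input to obtain the 'first image', then take its
    spatial gradient as the edge-gradient estimate."""
    first_image = upsample_bilinear(low_res.astype(float), scale)
    gy, gx = np.gradient(first_image)  # axis-0 (rows), axis-1 (cols)
    return gx, gy
```

In the patent the upsampling additionally preserves edge direction through learning; that refinement is omitted in this sketch.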
  • the second resolution feature gradient is a facial feature gradient corresponding to the input image in a standard gradient image library
  • Determining the second resolution feature gradient comprises: estimating the pose of the input image and extracting feature points of the input image; searching for the corresponding facial feature component in the standard gradient image library based on the pose and feature points; and calculating the feature gradient of the corresponding facial feature component to determine the second resolution feature gradient.
  • Searching for the corresponding facial feature component comprises: aligning the pose of the input image with the pose of the image samples in the standard gradient image library according to the pose of the input image, and then finding the corresponding facial feature components in the standard gradient image library according to the feature points of the input image.
  • The pose of the input image is aligned with the pose of the image sample in the standard gradient image library by a non-reflective similarity transformation, where
  • T(x(h), y(h)) is the non-reflective similarity transformation of the image sample, and
  • (x(l), y(l)) is the upsampled input image
  • the image gradient information further comprises a background gradient.
  • the background gradient is a gradient of a flat region in the input image
  • Determining the background gradient comprises: upsampling the input image to obtain a second image, and calculating the gradient information of the second image as the background gradient information of the input image.
  • The acquisition of the second image includes: selecting the background region of the low-resolution image according to a self-learned statistical prior; transforming the background region with a set scaling factor s, so that each pixel of the background region is sampled as an s×s pixel block; and using a back-projection algorithm to ensure that the upsampled image satisfies a Gaussian distribution.
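The s×s block sampling of the background region can be sketched with `np.kron`; the back-projection pass that enforces the Gaussian distribution is a separate step and is only indicated, not implemented, here.

```python
import numpy as np

def upsample_background(region, s):
    """Replicate every pixel of the background region into an s x s
    block, as described: each pixel is sampled as an s*s pixel block.
    The subsequent back-projection / Gaussian-consistency step is not
    modeled in this sketch."""
    return np.kron(region, np.ones((s, s), dtype=region.dtype))
```

For example, a 2×2 background region with scaling factor s = 3 becomes a 6×6 region of constant 3×3 blocks.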
  • Merging the image gradient information comprises: setting the pixel value of the upper-left corner of the fused image region equal to the pixel value of the upper-left corner of the input image, and then obtaining the fused information by the least-squares principle, where:
  • b is the gradient and s is the required fusion information.
  • the reconstruction method further comprises constructing a standard gradient image library.
  • the face resolution reconstruction method further comprises aligning image samples in the standard gradient image library.
  • aligning the image samples in the standard gradient image library comprises:
  • Aligning the remaining image samples in the standard gradient image library with the standard image includes: selecting an image sample from the standard gradient image library as the image to be aligned, determining the feature points of that image according to the SIFT algorithm, and finding among them the set of feature points most similar to the feature point set of the standard image.
  • The remaining pixels of the image to be aligned are then translated to the corresponding positions on the standard image to complete its alignment with the standard image.
  • the face resolution reconstruction method further comprises training the standard gradient image library.
  • the step of training the standard gradient image library comprises:
  • Training each image sample in the standard gradient image library comprises: selecting an image sample from the standard gradient image library as a training sample, performing Gaussian smoothing and downsampling on the training sample to obtain a third image, and acquiring a feature mask map from the third image, wherein the feature mask map covers the feature components defined by the feature points of the third image;
  • upsampling is performed by bicubic interpolation to obtain the edge gradient, where (x, y) is the pixel to be interpolated, (x(i), y(j)) are the points of the 4×4 neighborhood around the pixel to be interpolated, i, j = 0, 1, 2, 3, and W is the weight function.
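As a concrete illustration of the bicubic step, the weight function W can be taken to be Keys' cubic convolution kernel; the patent gives W only symbolically, so the common choice a = -0.5 is an assumption of this sketch, and both function names are illustrative.

```python
import numpy as np

def bicubic_weight(t, a=-0.5):
    """Keys' cubic convolution kernel, a standard choice for the
    bicubic weight function W (a = -0.5 is assumed here)."""
    t = np.abs(t)
    return np.where(
        t <= 1,
        (a + 2) * t**3 - (a + 3) * t**2 + 1,
        np.where(t < 2, a * t**3 - 5 * a * t**2 + 8 * a * t - 4 * a, 0.0),
    )

def bicubic_interpolate(img, y, x):
    """Interpolate img at (y, x) from its 4x4 neighborhood:
    value = sum_ij img[y_i, x_j] * W(y - y_i) * W(x - x_j)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for j in range(-1, 3):        # 4 rows of the neighborhood
        for i in range(-1, 3):    # 4 columns of the neighborhood
            yi = min(max(y0 + j, 0), img.shape[0] - 1)  # clamp at borders
            xi = min(max(x0 + i, 0), img.shape[1] - 1)
            val += img[yi, xi] * bicubic_weight(y - (y0 + j)) * bicubic_weight(x - (x0 + i))
    return val
```

At integer coordinates the kernel reduces to the exact pixel value, and for linear image ramps the interpolation is exact, which is a useful sanity check on the weight function.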
  • An embodiment of the present invention further provides a face resolution reconstruction system, including: one or more processors; and one or more memories storing computer-executable instructions which, when run by the processors, perform the above-described face resolution reconstruction method.
  • a face resolution reconstruction system further includes an input end that receives an input image having a first resolution, and an output end that outputs an output image having a third resolution, The third resolution is higher than the first resolution.
  • Embodiments of the present invention also provide a computer readable medium storing computer executable instructions and performing the above-described face resolution reconstruction method when the executable instructions are executed by a processor.
  • The invention provides a machine-learning-based face resolution reconstruction method, which computes gradient information from a lower-resolution input image and, according to that gradient information, searches a higher-resolution standard gradient image library for the corresponding facial feature components; it then fuses the acquired gradient information and superimposes it onto the input image to obtain an output image with improved resolution.
  • The overall structural information of the face is preserved while the image resolution is improved, avoiding local distortion in the generated output image.
  • FIG. 1 is a flow chart showing a method of reconstructing a face resolution
  • FIG. 2 shows a schematic diagram of generating an output image from image gradient information of an input image
  • Figure 3 is a block diagram showing the alignment of image samples in a standard gradient image library
  • Figure 4 is a schematic view showing the alignment flow in Figure 3;
  • Figure 5 is a flow chart showing the training of image samples in a standard gradient image library
  • Figure 6 shows a schematic diagram of a face resolution reconstruction system.
  • Face resolution reconstruction is the process of generating a high-resolution image with fidelity from a low-resolution face image; it can be regarded as a specific application area of super-resolution technology.
  • The high-resolution and low-resolution image blocks are modeled under a probabilistic framework, and high-frequency details are extracted from the standard image.
  • The most similar image blocks are recovered from the standard sample set, and their first- and second-order derivatives are calculated to obtain the corresponding high-resolution image block.
  • The invention provides a machine-learning-based face resolution reconstruction method that preserves the overall structural information of the face during reconstruction, combines facial feature gradient information with background gradient information, and generates a high-resolution face image according to the standard gradient image library, thereby improving the image quality and adaptive effect of the reconstructed image.
  • The embodiment of the invention proposes a face resolution reconstruction method, which may be implemented as an intelligent algorithm executed by a deep-learning-based neural network (for example, a convolutional neural network).
  • The convolutional neural network can be implemented using a neural network model such as AlexNet, GoogLeNet, VGG, or a deep residual network.
  • In step S101, an input image is acquired, the input image having a first resolution.
  • The input image is the face image whose resolution is to be improved; it generally has a lower resolution, the specific structure of the face is blurred, and high-frequency information needs to be supplemented to obtain a clear face image.
  • The reconstructed face image fully contains the information of the low-resolution image, i.e., it is consistent with the structure of the original image while including supplementary detail information. This method of improving the resolution of a face image is also called "face hallucination".
  • In step S102, image gradient information of the input image is determined based on the input image and a standard gradient image library having the second resolution.
  • The standard gradient image library includes image samples having a second resolution. The second resolution is defined relative to the first resolution rather than as a specific value, and is generally higher than the first resolution, so that the face images contain more detail information; detail information for the facial features in the input image is then generated from the corresponding facial feature components in the image library. In other words, the face details are inferred from the standard gradient image library.
  • the facial feature component can generally refer to facial features such as the eyes, ears, mouth, nose, etc., which are also reference features used to complete face alignment.
  • the reconstruction method further includes constructing a standard gradient image library.
  • In step S103, the image gradient information determined in step S102 is fused, and the fused gradient information is superimposed onto the input image.
  • In step S104, an output image is generated, the output image having a third resolution. Like the second resolution, the third resolution is defined relative to the first resolution rather than as a specific value, and is higher than the first resolution. In the face resolution reconstruction process, a super-resolution output image is thus obtained from the lower-resolution input image combined with the image samples in the higher-resolution standard gradient image library; the output image clearly reflects the facial features of the face and can be used in fields such as face monitoring, face recognition, and facial expression analysis.
  • the second resolution is higher than the first resolution
  • the third resolution is higher than the first resolution
  • The terms first, second, and third do not represent any order, quantity, or importance.
  • the first resolution can typically be a low resolution
  • the second resolution can typically be a high resolution.
  • the third resolution is usually super-resolution, that is, super-resolution face reconstruction is implemented by implementing a face resolution reconstruction method on a low-resolution input image.
  • The image gradient information in step S102 above includes an edge gradient and a second resolution feature gradient.
  • Figure 2 shows a schematic diagram of generating the high-resolution output image from the image gradient information. The face resolution reconstruction method is described in detail below with reference to FIG. 2.
  • the edge gradient is a contour gradient of a face in the input image, which represents the overall structural information of the face.
  • The edge gradient is obtained by applying direction-preserving upsampling to the low-resolution input image.
  • Through learning, the structural information of edges is preserved and their shape information is stored, so as to suppress spurious artifacts.
  • A direction-preserving upsampling function is used for this purpose.
  • The image obtained by applying this direction-preserving upsampling to the input image is referred to as the first image; it retains the contour information of the face in the input image and is used to generate the face contour information in the high-resolution output image.
  • the gradient information of the first image is calculated as an edge gradient of the input image.
  • Image block division is performed on the input image and on the image samples of the standard gradient image library in a fallback manner.
  • image blocks can be divided in order from left to right and top to bottom.
  • The division falls back from the edges of the image: when a block reaches the right edge, it is retracted leftward with the right edge as reference, and when a block reaches the lower edge, it is retracted upward with the lower edge as reference.
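The fallback division can be sketched along a single axis; the function name and parameters here are illustrative, and the 2-D case applies the same rule to rows and columns independently.

```python
def patch_origins(length, patch, step):
    """Top-left coordinates of image blocks along one axis (length is
    the axis size, patch the block size, step the stride). When the
    last regular block would run past the far edge, an extra block is
    retracted inward so it is anchored on that edge, per the fallback
    rule described above."""
    assert patch <= length
    origins = list(range(0, length - patch + 1, step))
    last = length - patch
    if origins[-1] != last:
        origins.append(last)  # retract the final block from the edge
    return origins
```

For an axis of length 11 with 4-pixel blocks and stride 3, the origins are 0, 3, 6, and a retracted final block at 7, so the last block ends exactly on the image border.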
  • the second resolution feature gradient characterizes gradient information based on the facial face feature component. As shown in FIG. 2, the process of determining the second resolution feature gradient includes:
  • Finding the facial feature components corresponding to the input image in the standard gradient image library according to its pose and feature points includes: first aligning the pose of the input image with the pose of the image samples in the image library according to the pose of the input image, and then finding the corresponding facial feature components in the standard gradient image library according to the feature points of the input image.
  • the facial feature component generally refers to facial features such as the eyes, ears, mouth, nose, etc., which are also reference features used to complete face alignment.
  • the image samples in the standard gradient image library have been aligned, and the specific alignment steps are described in detail later.
  • the above alignment and alignment steps are beneficial for more accurate searching, avoiding search failure due to face feature rotation or offset, and the aligned and aligned images can be used for fusion and superposition of gradient information.
  • T(x(h), y(h)) is the non-reflective similarity transformation of the image sample, and (x(l), y(l)) is the upsampled input image; when the result of the non-reflective similarity transformation is closest to the upsampled input image, the images are considered aligned.
  • With the input image aligned to the image samples in the standard gradient image library, the corresponding facial feature components can be found more accurately, and face features closer to those of the original input image can then be recovered from the accurately found components. The facial feature gradient of the found facial feature component is calculated and taken as the second resolution feature gradient.
  • the image gradient information may further include a background gradient that characterizes a gradient of a flat region in the input image.
  • Obtaining the background gradient comprises: upsampling the input image in combination with low-pass filtering, ensuring during upsampling that the result satisfies a Gaussian distribution, to obtain a second image; the gradient information of the second image serves as the background gradient information of the input image.
  • The background gradient can be used together with the edge gradient and the second resolution feature gradient in the subsequent face resolution reconstruction processing.
  • Smaller pixel blocks can be used in the process of acquiring the background gradient to accommodate large expression changes.
  • The background region of the input image is selected according to a self-learned statistical prior, and the set scaling factor s transforms the background region into a higher-resolution image, sampling each pixel of the background region into an s×s pixel block.
  • A back-projection algorithm ensures that the upsampled image satisfies the Gaussian distribution; low-pass filtering is applied to the background region of the resulting second image, and the gradient extracted from it serves as the background gradient.
  • the obtained image gradient information includes an edge gradient of the input image, a second resolution feature gradient, and a background gradient.
  • The above three gradients represent the gradient information of different components of the input image: the edge gradient represents the contour information of the face and ensures that the overall structure of the reconstructed image remains consistent with the input image; the second resolution feature gradient represents the gradients of specific facial feature components and is used to generate detail information for the facial components during reconstruction; and the background gradient characterizes the flat regions of the input image where structural features are not apparent, keeping the background of the generated output image consistent with the input image and further avoiding artifacts and local edge distortion.
  • The image gradient of the input image is obtained in step S102; the image gradient information is then fused in step S103, and the fused gradient information is superimposed onto the input image.
  • In this embodiment the image gradient information includes an edge gradient, a second resolution feature gradient, and a background gradient, and at least two of these kinds of gradient information are fused. In other embodiments according to the present invention, the image gradient information may include one or more of these gradients, or other types of image gradient information; fusing the image gradient information means fusing at least two kinds of gradient information contained in it.
  • the fusion of the gradient information is performed by:
  • The pixel value of the upper-left corner of the image region after fusion is set equal to the pixel value of the upper-left corner of the input image, and the fused information is then obtained by the least-squares principle, where:
  • b is the gradient and s is the required fusion information.
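A one-dimensional sketch of this fusion step: the fused gradient b plays the role of a forward-difference field, and s is recovered by least squares with the first sample pinned to the input's corner pixel value. The 2-D case adds a second difference operator but follows the same pattern; the heavy anchor weight is an implementation convenience of this sketch, not a detail from the patent.

```python
import numpy as np

def fuse_from_gradient(grad, anchor):
    """Recover a signal s from its forward-difference gradient b by
    least squares, with s[0] pinned to the anchor (the input image's
    upper-left pixel value)."""
    n = len(grad) + 1
    # Forward-difference operator D: (D s)[i] = s[i+1] - s[i]
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    # Anchor row pins s[0] = anchor; weighted heavily so it is honored.
    A = np.vstack([D, 1000.0 * np.eye(1, n)])
    b = np.concatenate([grad, [1000.0 * anchor]])
    s, *_ = np.linalg.lstsq(A, b, rcond=None)
    return s
```

With a constant unit gradient and anchor 5, the recovered signal is the ramp 5, 6, 7, 8, 9 — the gradient field determines the shape and the anchor fixes the absolute level, mirroring the role of the input image's corner pixel in the patent.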
  • the edge gradient of the input image, the second resolution feature gradient, and the background gradient are fused to obtain the fused gradient information.
  • The superposition in step S103 can be performed by adding and averaging, or according to a formula in which the image is I, the gradient is g, and the weight parameter is b with value range (0, 1), giving the superimposed image.
  • the value of b is generally selected by experience.
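The patent's superposition formula is not reproduced on this page; a plausible reading consistent with the surrounding text (image I, gradient g, weight b in (0, 1)) is a weighted additive update, used here purely as an assumption.

```python
import numpy as np

def superimpose(image, fused_gradient, b=0.5):
    """Superimpose fused gradient information onto the image.
    The update I' = I + b * g is an assumed reading of the patent's
    (unreproduced) formula; b in (0, 1) is chosen by experience."""
    assert 0 < b < 1
    return image + b * fused_gradient
```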
  • The output image obtained through the above steps carries image information consistent with the input image while having a higher resolution.
  • Because the edge gradient, second resolution feature gradient, and background gradient are all taken into account during generation, artifacts, ghosting, and local distortion are avoided in the generated high-resolution output image, further improving the face reconstruction effect.
  • the standard gradient image library used in the face resolution reconstruction method includes a series of face image samples having a higher resolution than the input image.
  • The face reconstruction method provided by the embodiment of the present invention further includes aligning the image samples in the standard gradient image library. Before the facial feature components corresponding to the input image are determined from the standard gradient image library, the image samples in the library should be aligned; this facilitates accurate searching for facial feature components as well as the training process of the standard gradient image library described below.
  • FIG. 3 shows a specific alignment flow chart for image samples
  • FIG. 4 shows a schematic diagram of image sample alignment. The alignment process between image samples will be described in detail below with reference to FIGS. 3 and 4.
  • step S1 an image sample is selected from the standard gradient image library as a standard image.
  • the pose of the remaining image samples is aligned with the pose of the standard image.
  • step S2 feature points of the standard image are obtained according to the SIFT algorithm, and a feature point set is selected therefrom and stored.
  • In step S3, another image sample is selected from the standard gradient image library as the image to be aligned; its feature points are obtained according to the SIFT algorithm, and from them the feature point set most similar to the feature point set of the standard image is found. This feature point set is used for pose alignment.
  • In step S4, the image to be aligned is rotated and scaled until its feature point set corresponds, in equal proportion, to the feature point set of the standard image.
  • the specific schematic process is shown in Figure 4.
  • In step S5, the feature points of the image to be aligned are translated to the positions of the standard image's feature point set to obtain a SIFT flow.
  • In step S6, the remaining pixels of the image to be aligned are translated to the corresponding positions on the standard image using the flow information, completing the alignment of the image with the standard image.
  • the aligned image has a consistent pose with the standard image, which is more conducive to the search step in the face resolution reconstruction method.
  • In step S7, it is judged whether any image samples in the standard gradient image library have not yet been aligned with the standard image. If such samples exist, each is selected in turn as the image to be aligned and the alignment process of steps S3-S6 is performed; if none exist, the alignment process ends.
  • the above alignment process should also be performed on the updated image samples to ensure that the image samples in the standard gradient image library have a consistent pose.
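The rotation, scaling, and translation of steps S4-S5 amount to fitting a non-reflective similarity transform to matched feature points. A least-squares sketch (SIFT matching itself is assumed already done; names are illustrative):

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares non-reflective similarity transform (rotation,
    uniform scale, translation) mapping matched feature points src
    onto dst. src and dst are (N, 2) arrays of corresponding points."""
    # Parameterize as [a, b, tx, ty]:
    #   x' = a*x - b*y + tx,   y' = b*x + a*y + ty
    x, y = src[:, 0], src[:, 1]
    A = np.zeros((2 * len(src), 4))
    A[0::2] = np.column_stack([x, -y, np.ones_like(x), np.zeros_like(x)])
    A[1::2] = np.column_stack([y, x, np.zeros_like(x), np.ones_like(x)])
    rhs = dst.reshape(-1)  # interleaved [x0', y0', x1', y1', ...]
    params, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    a, b, tx, ty = params

    def transform(pts):
        px, py = pts[:, 0], pts[:, 1]
        return np.column_stack([a * px - b * py + tx, b * px + a * py + ty])

    return transform
```

Applying the fitted transform to all pixels of the image to be aligned corresponds to the rotation/scaling of step S4 plus the translation of step S5; the per-pixel SIFT-flow refinement of step S6 is a further, denser correspondence not modeled here.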
  • the face reconstruction method provided by the embodiment of the present invention further includes training the standard gradient image library.
  • Each image sample in the standard gradient image library corresponds to a feature point set.
  • the specific training process is shown in FIG. 5.
  • step S1 an image sample is selected from the standard gradient image library as a training sample, wherein the image sample has not been selected as a training sample.
  • Gaussian smoothing and downsampling are performed on the training sample to obtain a low-resolution third image, and a feature mask map is acquired from the third image, the feature mask map covering the feature components defined by the feature points of the third image.
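The Gaussian smoothing and downsampling that produce the third image can be sketched with a separable numpy convolution; the kernel width and downsampling factor are illustrative choices, not values from the patent.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of the given radius."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def smooth_and_downsample(img, sigma=1.0, factor=2):
    """Gaussian-smooth a training sample, then downsample it to obtain
    the low-resolution 'third image' used to build the feature mask."""
    radius = int(3 * sigma)
    k = gaussian_kernel1d(sigma, radius)
    pad = np.pad(img.astype(float), radius, mode="edge")
    # Separable convolution: filter rows, then columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
    return out[::factor, ::factor]
```

This mimics how each high-resolution library sample is degraded into a stand-in for a low-resolution input during training, so that the search parameters can be tuned against known ground truth.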
  • step S2 the most similar facial feature component is searched in the standard gradient image library using the feature mask map obtained in step S1, wherein the search is performed, for example, in a self-learning manner.
  • step S3 feature gradients are extracted from the most similar facial feature components found.
  • step S4 it is judged whether the feature gradient extracted from the most similar facial feature component corresponds to the feature gradient of the training sample, thereby completing the training process for the training sample.
  • In step S5, it is determined whether any image samples in the standard gradient image library have not been selected as training samples. If so, the method returns to step S1, that image sample is selected as the training sample, and the training steps S1-S4 are performed; if not, the training ends.
  • the above training process can also be performed on the updated image samples when updating the image samples in the standard gradient image library.
  • the above training process enables the face reconstruction method according to an embodiment of the present invention to adjust parameters to adapt to the current standard gradient image library, that is, to find the most similar facial feature components more efficiently.
  • During the search process, the method according to an embodiment of the invention iteratively adjusts its parameters based on the better results obtained.
  • In face resolution reconstruction, the input low-resolution image plays the same role as the training sample here.
  • the present invention proposes a face resolution reconstruction method based on machine learning.
  • the entire structure of the face is preserved by the face edge gradient information, and the face feature gradient information and the background gradient information are combined.
  • the embodiment of the invention further provides a face resolution reconstruction system, and the structure diagram of the face resolution reconstruction system is shown in FIG. 6 .
  • the face resolution reconstruction system includes one or more processors 601 and one or more memories 602. Wherein, the memory stores computer executable instructions, and when the processor 601 executes an instruction stored in the memory 602, the above-described face resolution reconstruction method is executed.
  • The face resolution reconstruction system may further include an input terminal 603 and an output terminal 604, where the input terminal 603 receives a lower-resolution input image and the output terminal 604 outputs the output image processed by the face resolution reconstruction system; the output image carries the face information of the input image together with high-resolution details.
  • the processor 601 can access the memory 602 through the system bus. In addition to storing executable instructions, the memory 602 can also store training data and the like.
  • The processor 601 can be any of a variety of devices with computing capability, such as a central processing unit (CPU) or a graphics processing unit (GPU).
  • The CPU can be an X86 or ARM processor; the GPU can be integrated directly on the motherboard, built into the motherboard's north bridge chip, or built into the CPU.
  • Owing to the GPU's powerful image processing capability, embodiments of the present invention may preferably use a GPU to train the convolutional neural network and to perform image processing based on the convolutional neural network.
  • the face resolution reconstruction system may also include data storage accessible by the processor 601 over the system bus.
  • the data store can include executable instructions, multiple image training data, and the like.
  • the face resolution reconstruction system also includes an input interface that allows an external device to communicate with the face resolution reconstruction system.
  • an input interface can be used to receive instructions from an external computer device, from a user, or the like.
  • the face resolution reconstruction system may also include an output interface that interfaces the face resolution reconstruction system with one or more external devices.
  • the face resolution reconstruction system can display an image or the like through an output interface.
  • External devices that communicate with the face resolution reconstruction system through the input interface and the output interface can be included in an environment that provides virtually any type of user interface with which a user can interact.
  • Examples of user interface types include graphical user interfaces, natural user interfaces, and the like.
  • The graphical user interface can accept input from a user's input device(s), such as a keyboard, mouse, or remote control, and provide output on an output device such as a display.
  • The natural language interface can enable a user to interact with the face resolution reconstruction system in a manner free of the constraints imposed by input devices such as keyboards, mice, and remote controls.
  • Instead, natural user interfaces may rely on speech recognition, touch and stylus recognition, gesture recognition on and near the screen, air gestures, head and eye tracking, voice and speech, vision, touch, gestures, machine intelligence, and the like.
  • the face resolution reconstruction system is illustrated as a single system, it can be understood that the face resolution reconstruction system can also be a distributed system, and can also be arranged as a cloud facility (including a public cloud or a private cloud). Thus, for example, several devices can communicate over a network connection and can collectively perform tasks that are described as being performed by a face resolution reconstruction system.
  • An embodiment of the invention further provides a computer-readable medium, which includes a computer-readable storage medium.
  • The computer-readable storage medium can be any available storage medium that can be accessed by a computer.
  • By way of example and not limitation, such computer-readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Propagated signals are not included within the scope of computer-readable storage media.
  • Computer-readable media also include communication media, which include any medium that facilitates transfer of a computer program from one place to another. A connection can, for example, be a communication medium.
  • If the software is transmitted from a web site, server, or other remote source using coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave,
  • then that coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technology such as infrared, radio, and microwave is included in the definition of communication medium.
  • Combinations of the above should also be included within the scope of computer-readable media.
  • Alternatively or additionally, the functions described herein may be performed at least in part by one or more hardware logic components.
  • Illustrative types of hardware logic components that can be used include field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), application-specific standard products (ASSP), systems on chip (SOC), complex programmable logic devices (CPLD), and so on.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The present invention provides a machine-learning-based face resolution reconstruction method that preserves the overall structural information of the face while increasing image resolution, avoiding local distortion in the generated output image. The face resolution reconstruction method includes: acquiring an input image having a first resolution; determining image gradient information of the input image based on the input image and a standard gradient image library having a second resolution; fusing the image gradient information and superimposing the fused gradient information onto the input image; and generating an output image having a third resolution, wherein both the second resolution and the third resolution are higher than the first resolution.

Description

Face resolution reconstruction method, reconstruction system and readable medium
This application claims priority to Chinese Patent Application No. 201611084243.3, filed on November 30, 2016, which is incorporated herein by reference in its entirety as part of this application.
Technical Field
The present invention relates to image processing, and in particular to a machine-learning-based face resolution reconstruction method, reconstruction system, and readable medium.
Background
Face image super-resolution reconstruction is mainly applicable to enlarging the photos stored in existing IC cards so that they can be viewed, printed, and so on. It is especially useful where replacing existing (storage and acquisition) equipment is costly and re-acquisition is impractical.
In addition, face images obtained from security systems and surveillance equipment can be super-resolved to facilitate recognition. Owing to hardware process, cost, and other constraints, clear high-resolution images may not be obtainable in the surveillance field; super-resolution reconstruction reduces dependence on hardware equipment and improves system usability.
Summary
An embodiment of the present invention provides a face resolution reconstruction method, including: acquiring an input image, the input image having a first resolution; determining image gradient information of the input image based on the input image and a standard gradient image library having a second resolution; fusing the image gradient information and superimposing the fused gradient information onto the input image; and generating an output image, the output image having a third resolution, wherein both the second resolution and the third resolution are higher than the first resolution.
According to an embodiment of the present invention, the image gradient information includes an edge gradient and a second-resolution feature gradient.
According to an embodiment of the present invention, the edge gradient is a face contour gradient, and determining the edge gradient of the input image includes: upsampling the input image while preserving its directional characteristics to obtain a first image; and computing the gradient information of the first image as the edge gradient information of the input image.
According to an embodiment of the present invention, the second-resolution feature gradient is the facial feature gradient in the standard gradient image library corresponding to the input image, and determining the second-resolution feature gradient includes: estimating the pose of the input image and extracting feature points of the input image; searching the standard gradient image library for the corresponding facial feature components based on the pose and the feature points; and computing the feature gradient of the corresponding facial feature components and determining it as the second-resolution feature gradient.
According to an embodiment of the present invention, searching for the corresponding facial feature components includes: aligning the pose of the input image with the poses of the image samples in the standard gradient image library according to the pose of the input image; and finding the corresponding facial feature components in the standard gradient image library according to the feature points of the input image.
According to an embodiment of the present invention, the pose of the input image is aligned with the poses of the image samples in the standard gradient image library according to the following formula:
min∑||T(x(h),y(h))-(x(l),y(l))||2
where T(x(h),y(h)) is the non-reflective similarity transform of an image sample and (x(l),y(l)) is the upsampled input image; the images are determined to be aligned when the difference between the feature-point coordinates of the transformed image sample and those of the upsampled input image reaches a minimum.
According to an embodiment of the present invention, the image gradient information further includes a background gradient.
According to an embodiment of the present invention, the background gradient is the gradient of the flat regions of the input image, and determining the background gradient includes: upsampling the input image to obtain a second image; and computing the gradient information of the second image as the background gradient information of the input image.
According to an embodiment of the present invention, obtaining the second image includes: selecting the background region of the low-resolution image according to a self-learned statistical prior; transforming the background region into a high-resolution image with a set scale factor s, wherein each pixel of the background region is upsampled into an s×s pixel block; and ensuring, through a back-propagation algorithm, that the upsampled image satisfies a Gaussian distribution.
According to an embodiment of the present invention, fusing the image gradient information includes: setting the top-left pixel value of the fused image region equal to the top-left pixel value of the input image, and then solving for the fused information by least squares:
min||∇s-b||2
where ∇ is the gradient operator, b is the gradient, and s is the fused information to be solved for.
According to an embodiment of the present invention, the reconstruction method further includes constructing the standard gradient image library.
According to an embodiment of the present invention, the face resolution reconstruction method further includes aligning the image samples in the standard gradient image library.
According to an embodiment of the present invention, aligning the image samples in the standard gradient image library includes:
selecting one image sample from the standard gradient image library as a standard image;
obtaining the feature points of the standard image by the SIFT algorithm, and selecting a feature point set from among them and storing it;
aligning the remaining image samples in the standard gradient image library with the standard image, including: selecting an image sample from the standard gradient image library as an image to be aligned, obtaining its feature points by the SIFT algorithm, and finding among them the feature point set most similar to said feature point set of the standard image;
rotating and scaling the image to be aligned until its feature point set is in proportional correspondence with the feature point set of the standard image;
translating the coordinates of the feature point set of the image to be aligned to the positions of the feature point set of the standard image, obtaining the SIFT optical flow;
using the optical flow information, translating the remaining pixels of the image to be aligned to the corresponding positions on the standard image, completing the alignment of the image to be aligned with the standard image.
According to an embodiment of the present invention, the face resolution reconstruction method further includes training the standard gradient image library.
According to an embodiment of the present invention, training the standard gradient image library includes:
training every image sample in the standard gradient image library, wherein the training process for each image sample includes: selecting one image sample from the standard gradient image library as a training sample; applying Gaussian smoothing and downsampling to the training sample to obtain a third image; and obtaining a feature mask map from the third image, wherein the feature mask map covers the feature components delimited by the feature points of the third image;
using the feature mask map, searching the standard gradient image library in a self-learning manner for the most similar facial feature components;
extracting feature gradients from the most similar facial feature components found;
completing the training of the training sample.
According to an embodiment of the present invention, the upsampling for obtaining the edge gradient is performed by bicubic interpolation:
f(x,y) = ∑i ∑j f(x(i),y(j)) W(x-x(i)) W(y-y(j)), i,j = 0,1,2,3
where (x,y) is the pixel to be interpolated, (x(i),y(j)) are the 4×4 neighborhood points around it, i,j = 0,1,2,3, and W is the weight function.
An embodiment of the present invention further provides a face resolution reconstruction system, including: one or more processors; and one or more memories, wherein the memories store computer-executable instructions that, when run by the processors, perform the above face resolution reconstruction method.
According to an embodiment of the present invention, the face resolution reconstruction system further includes an input terminal and an output terminal, the input terminal receiving an input image having the first resolution, and the output terminal outputting an output image having the third resolution, the third resolution being higher than the first resolution.
An embodiment of the present invention further provides a computer-readable medium storing computer-executable instructions that, when run by a processor, perform the above face resolution reconstruction method.
The present invention provides a machine-learning-based face resolution reconstruction method that computes the gradient information of a lower-resolution input image, searches a higher-resolution standard gradient image library for the corresponding facial feature components according to that gradient information, fuses the obtained gradient information, and superimposes it onto the input image to obtain an output image of increased resolution. On this basis, the overall structural information of the face is preserved while the image resolution is increased, avoiding local distortion in the generated output image.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings of the embodiments are briefly introduced below. Obviously, the drawings described below relate only to some embodiments of the present disclosure and do not limit it.
FIG. 1 is a flowchart of the face resolution reconstruction method;
FIG. 2 is a schematic diagram of generating an output image from the image gradient information of an input image;
FIG. 3 is a flowchart of aligning the image samples in the standard gradient image library;
FIG. 4 is a schematic diagram of the alignment flow of FIG. 3;
FIG. 5 is a flowchart of training the image samples in the standard gradient image library;
FIG. 6 is a schematic diagram of the face resolution reconstruction system.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Unless otherwise defined, the technical or scientific terms used herein shall have the ordinary meaning understood by a person of ordinary skill in the art to which the present invention belongs. "First", "second", and similar words used in this disclosure do not denote any order, quantity, or importance, but merely distinguish different components.
Face resolution reconstruction is the process of generating a high-resolution image from a low-resolution face image by fidelity-preserving means; it can be regarded as a domain-specific super-resolution technique.
Most existing techniques are based on patch similarity or on global constraints over the whole image. High- and low-resolution image patches are modeled in a probabilistic framework, and high-frequency detail is extracted from standard images: each query patch cut from the input image is matched to the most similar patch recovered from the standard sample set, and the corresponding high-resolution patch is obtained by computing first- and second-order derivatives. This approach preserves more detail, but because the high-resolution patches are not structured, it produces image artifacts.
In addition, there is a technique that performs subspace learning on the training sample set by principal component analysis (PCA) and obtains a high-resolution image with linear constraints. A patch-based Markov random field model reconstructs redundant information and recovers high-frequency detail. Because of the limitations of a linear subspace, good resolution reconstruction is possible only when the image is well behaved (fixed pose and expression, precisely aligned). When the image does not satisfy these conditions, the holistic PCA appearance model causes the resulting image to be frequently accompanied by ghosting.
None of the above reconstruction methods takes into account the overall structural information of the face in the low-resolution input image, so the output high-resolution image exhibits artifacts, ghosting, or local distortion.
The present invention provides a machine-learning-based face resolution reconstruction method that preserves the overall structural information of the face during reconstruction and, combining face feature gradient information with background gradient information, generates a high-resolution face image from a standard gradient image library, thereby improving the image quality and adaptivity of the reconstructed image.
An embodiment of the present invention provides a face resolution reconstruction method, which can be an intelligent algorithm executed by a deep-learning neural network (for example, a convolutional neural network). The convolutional neural network can be implemented with neural network models such as AlexNet, GoogleNet, VGG, or Deep Residual Learning.
A flowchart of the face resolution reconstruction method according to an embodiment of the present invention is shown in FIG. 1. In step S101, an input image having a first resolution is acquired. The input image is a face image whose resolution is to be increased; it generally has a low resolution and the concrete structure of the face is blurry, so its high-frequency information needs to be supplemented to obtain a clear face image. The resulting face image both fully contains the information of the low-resolution image, i.e. remains structurally consistent with the original image, and contains supplemented detail information; this way of increasing the resolution of a face image is also called "face hallucination". In step S102, the image gradient information of the input image is determined based on the input image and a standard gradient image library having a second resolution. The standard gradient image library contains image samples having the second resolution. The second resolution is a concept relative to the first resolution, not a specific value; it is generally higher than the first resolution, so that the face images contain more detail information, allowing the detail of the face features in the input image to be generated from the facial feature components in the library that correspond to the input image, i.e. the face detail of the input image is conjectured from the standard gradient image library. For example, the facial feature components usually refer to facial features such as the eyes, ears, mouth, and nose, which are also the reference features used for face alignment.
Specifically, the reconstruction method also includes constructing the standard gradient image library.
Next, in step S103, the image gradient information determined in step S102 is fused, and the fused gradient information is superimposed onto the input image.
In step S104, an output image is generated. The output image has a third resolution, which is a concept relative to the first resolution: it is higher than the first resolution and is not a specific value. That is, in the above face resolution reconstruction process, a super-resolution output image is obtained from the lower-resolution input image combined with the image samples of the higher-resolution standard gradient image library; the output clearly reflects the facial features of the face and can be used in fields such as face surveillance, face recognition, and facial expression analysis.
In the embodiment of the present invention, the second resolution is higher than the first resolution and the third resolution is higher than the first resolution; "first", "second", and "third" do not denote any order, quantity, or importance. For example, the first resolution is usually a low resolution and the second resolution is usually a high resolution. The third resolution is usually a super resolution, i.e. super-resolution face reconstruction is achieved by applying the face resolution reconstruction method to the low-resolution input image.
According to an embodiment of the present invention, the image gradient information in step S102 includes an edge gradient and a second-resolution feature gradient. FIG. 2 is a schematic diagram of generating a high-resolution output image from the image gradient information. The face resolution reconstruction method is described in detail below with reference to FIG. 2.
The edge gradient is the contour gradient of the face in the input image and characterizes the overall structural information of the face. Introducing the edge gradient keeps the overall contour of the original image from being lost while the output image is generated, avoiding phenomena that degrade the reconstruction result, such as local distortion, artifacts, and ghosting in the generated image. As shown in FIG. 2, the edge gradient is obtained by upsampling the low-resolution input image while preserving directional characteristics. In the embodiment of the present invention, the structural information of the edges and their shape information are retained by a learned statistical prior to remove spurious artifacts; the direction-preserving upsampling function is:
f_k(p) = exp(-||P-Q_k||σ), k = 1, ..., K;
where P is the pixel block centered on pixel p and Q_k is the neighboring pixel block of p in direction k. Pixel blocks rather than single pixels are used because a block is more robust to noise than a single pixel. That is, the result f_k(p) indicates the direction of the block centered on p, and every pixel interpolated within that block retains that direction. Numerically, the upsampling uses bicubic interpolation:
f(x,y) = ∑i ∑j f(x(i),y(j)) W(x-x(i)) W(y-y(j)), i,j = 0,1,2,3
where (x,y) is the pixel to be interpolated, (x(i),y(j)) are the 4×4 neighborhood points around it, i,j = 0,1,2,3, and W is the weight function.
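As a concrete illustration, the bicubic interpolation above can be sketched in a few lines of Python. The particular cubic convolution kernel used for the weight function W (with parameter a = -0.5) is a common choice assumed here for illustration; the source does not specify W:

```python
import math

def w(x, a=-0.5):
    # Cubic convolution kernel; a = -0.5 is a common (assumed) choice for W.
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0

def bicubic(img, x, y):
    """Interpolate img (2D list, indexed img[row][col]) at real-valued (x, y),
    using the 4x4 neighborhood around the point as in the formula above."""
    x0, y0 = math.floor(x), math.floor(y)
    val = 0.0
    for j in range(-1, 3):          # rows y(j)
        for i in range(-1, 3):      # columns x(i)
            # clamp neighborhood indices to the image border
            xi = min(max(x0 + i, 0), len(img[0]) - 1)
            yj = min(max(y0 + j, 0), len(img) - 1)
            val += img[yj][xi] * w(x - (x0 + i)) * w(y - (y0 + j))
    return val
```

Because this kernel's weights sum to one at any sampling phase, constant regions are reproduced exactly and grid points return their original pixel values.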
The image obtained by the above direction-preserving upsampling of the input image is called the first image; it retains the contour information of the face in the input image and is used to generate the face contour information of the high-resolution output image. The gradient information of the first image is computed as the edge gradient of the input image.
In the embodiment of the present invention, the input image and the image samples of the standard gradient image library are partitioned into image blocks in a back-off manner. For example, the blocks can be laid out from left to right and from top to bottom. When the partitioning reaches an image border and the remaining size is smaller than the preset block size, the partitioning backs off from that border. For example, on reaching the right border the partition backs off to the left with that border as the reference, and on reaching the bottom border it backs off upward with that border as the reference.
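The back-off block layout described above can be sketched as a small helper that computes the block origins along one axis; the function name and signature are illustrative, not from the source:

```python
def block_origins(length, block):
    """Left (or top) coordinates of blocks laid out along an axis of `length`
    with step `block`. If the remainder at the border is smaller than `block`,
    the final block is backed off so that it ends exactly at the border."""
    if length <= block:
        return [0]
    origins = list(range(0, length - block, block))
    origins.append(length - block)   # backed-off final block
    return origins
```

Applying the helper to both axes yields the full 2-D block partition; note that the backed-off final block may overlap its predecessor, which is exactly the behavior the back-off scheme implies.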
The second-resolution feature gradient characterizes the gradient information based on the facial feature components. As shown in FIG. 2, determining the second-resolution feature gradient includes:
(1) estimating the pose of the input image and extracting its feature points;
(2) searching the standard gradient image library for the corresponding facial feature components based on the pose and the feature points;
(3) computing the facial feature gradient of the corresponding facial feature components and determining it as the second-resolution feature gradient.
Searching the standard gradient image library for the facial feature components corresponding to the input image based on its pose and feature points includes: aligning the pose of the input image with the poses of the image samples in the library according to the pose of the input image, and then finding the corresponding facial feature components in the standard gradient image library according to the feature points of the input image. The facial feature components usually refer to facial features such as the eyes, ears, mouth, and nose, which are also the reference features used for face alignment.
In the embodiment of the present invention, the image samples in the standard gradient image library have already been pose-aligned; the concrete alignment steps are described in detail below. These alignment steps facilitate a more accurate search and avoid search failures caused by rotation or offset of the face features; moreover, the aligned images can be used for the fusion and superimposition of the gradient information.
The above pose-based alignment of the input image with the image samples in the standard gradient image library can be performed according to the following formula:
min∑||T(x(h),y(h))-(x(l),y(l))||2
where T(x(h),y(h)) is the non-reflective similarity transform of an image sample and (x(l),y(l)) is the upsampled input image; the images are considered aligned when the difference between the feature-point coordinates of the transformed image sample and those of the upsampled input image reaches a minimum.
The input image aligned with the image samples of the standard gradient image library finds its corresponding facial feature components more accurately; in turn, face features closer to those of the original input image can be recovered from the accurately found components. The facial feature gradient of the found facial feature components is computed and determined to be the second-resolution feature gradient.
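Minimizing the summed squared feature-point differences over a non-reflective similarity transform (scale, rotation, translation, no reflection) has a classical closed-form least-squares solution based on centroids; the sketch below assumes that formulation, and the function name and parameterization are illustrative rather than taken from the source:

```python
import math

def fit_similarity(src, dst):
    """Least-squares non-reflective similarity T(p) = s*R(theta)*p + t mapping
    src points onto dst points, minimizing sum ||T(p) - q||^2.
    src, dst: lists of (x, y) tuples. Returns (s, theta, tx, ty)."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n; cy_s = sum(p[1] for p in src) / n
    cx_d = sum(q[0] for q in dst) / n; cy_d = sum(q[1] for q in dst) / n
    a = b = denom = 0.0
    for (x, y), (u, v) in zip(src, dst):
        xs, ys = x - cx_s, y - cy_s     # centered source point
        ud, vd = u - cx_d, v - cy_d     # centered destination point
        a += xs * ud + ys * vd          # dot product  -> s*cos(theta)
        b += xs * vd - ys * ud          # cross product -> s*sin(theta)
        denom += xs * xs + ys * ys
    s = math.hypot(a, b) / denom
    theta = math.atan2(b, a)
    # translation maps the source centroid onto the destination centroid
    tx = cx_d - s * (math.cos(theta) * cx_s - math.sin(theta) * cy_s)
    ty = cy_d - s * (math.sin(theta) * cx_s + math.cos(theta) * cy_s)
    return s, theta, tx, ty
```

Because the rotation is recovered from dot and cross products of the centered point sets, the estimate is never a reflection, matching the non-reflective constraint above.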
In the embodiment of the present invention, the image gradient information can also include a background gradient, which characterizes the gradient of the flat regions of the input image. Obtaining the background gradient includes: upsampling the input image combined with low-pass filtering, ensuring during upsampling that the upsampled image satisfies a Gaussian distribution, to obtain a second image; the gradient information of the second image serves as the background gradient information of the input image. The background gradient can be used together with the edge gradient and the second-resolution feature gradient in the subsequent face resolution reconstruction.
Specifically, small pixel blocks can be used in the above background-gradient process to account for large changes in expression. The background region of the input image is selected according to a self-learned statistical prior, and the background region is transformed into a higher-resolution image with the set scale factor s, upsampling each pixel of the background region into an s×s pixel block. A back-propagation algorithm ensures that the upsampled image satisfies a Gaussian distribution; the background region of the resulting second image is low-pass filtered, and the gradient is extracted from it as the background gradient. Processing the background region requires no search of the standard gradient image library: only a low-to-high resolution transform of the particular cut-out part of the input image itself, followed by gradient extraction.
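The s×s pixel-block expansion of the background region can be sketched as follows; the Gaussian-distribution constraint enforced by back-propagation in the text is omitted from this minimal illustration:

```python
def upsample_blocks(region, s):
    """Expand each pixel of `region` (a 2D list) into an s x s block, as in
    the background-gradient step. The back-propagation/Gaussian constraint
    described in the text is intentionally left out of this sketch."""
    out = []
    for row in region:
        expanded = []
        for px in row:
            expanded.extend([px] * s)                   # widen the row
        out.extend([list(expanded) for _ in range(s)])  # repeat it s times
    return out
```

A 2×2 region with factor s = 2 thus becomes a 4×4 image of constant s×s blocks, which a subsequent low-pass filter would smooth before the gradient is extracted.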
In summary, in the embodiment of the present invention, the obtained image gradient information includes the edge gradient, the second-resolution feature gradient, and the background gradient of the input image. These three gradients characterize the gradient information of different components of the input image: the edge gradient characterizes the contour information of the face and keeps the overall structure of the reconstructed image consistent with the input image; the second-resolution feature gradient characterizes the gradients of the concrete facial feature components of the face and is used to generate the detail information of the facial components during face resolution reconstruction; the background gradient characterizes the information of the flat regions of the input image whose structural features are not salient, making the background of the generated output image consistent with the input image and further avoiding artifacts and local edge distortion.
According to the face resolution reconstruction method provided by the embodiment of the present invention, the image gradients of the input image have been obtained in step S102. Next, in step S103, the image gradient information is fused, and the fused gradient information is superimposed onto the input image. It should be understood that in the embodiment of the present invention the image gradient information includes the edge gradient, the second-resolution feature gradient, and the background gradient, and at least two of these kinds of gradient information are fused; in other embodiments of the present invention, one or several of the above kinds of image gradient information, or other types of image gradient information, may be included, and fusing the image gradient information means fusing at least two kinds of gradient information among them.
In the embodiment of the present invention, the gradient information is fused as follows:
First, the top-left pixel value of the fused image region is set equal to the top-left pixel value of the input image, and the fused information is then solved for by least squares:
min||∇s-b||2
where ∇ is the gradient operator, b is the gradient, and s is the fused information to be solved for.
In the embodiment of the present invention, the edge gradient, the second-resolution feature gradient, and the background gradient of the input image are fused to obtain the fused gradient information.
In the embodiment of the present invention, the superimposition in step S103 can be obtained by adding and averaging, or by the following formula. Let the image be I, the gradient g, and the weight parameter b, with b taking values in the open interval (0,1); the superimposed image is:
m = b*I + (1-b)*g
where the value of b is generally chosen appropriately from experience.
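In one dimension, anchoring the first value to the input image and integrating the gradient is exactly the least-squares solution of min||∇s-b||2 when ∇ is the forward-difference operator and the system is fully determined. Together with the weighted superposition m = b*I + (1-b)*g, this can be sketched as follows (the weight is renamed w in the code to avoid clashing with the gradient symbol b):

```python
def fuse_1d(first_pixel, grad):
    """Integrate a fused gradient signal with the first value anchored to the
    input image's first pixel -- the exact least-squares solution when the
    forward-difference operator is used and the system is determined."""
    s = [float(first_pixel)]
    for g in grad:
        s.append(s[-1] + g)
    return s

def overlay(image, grad, w=0.8):
    """Superimpose gradient info onto the image: m = w*I + (1-w)*g.
    The text calls this weight b; it is renamed w here, and w in (0,1)
    is chosen empirically."""
    return [w * i + (1 - w) * g for i, g in zip(image, grad)]
```

In two dimensions the analogous anchored least-squares problem is an overdetermined Poisson-type system rather than a simple running sum, but the 1-D case shows the role of the anchored top-left value.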
Compared with the input image, the output image obtained through the above steps has both consistent image information and a higher resolution. Using parameters such as the edge gradient, the second-resolution feature gradient, and the background gradient during generation prevents artifacts, ghosting, and local distortion in the generated high-resolution output image, further improving the face reconstruction result.
According to an embodiment of the present invention, the standard gradient image library used in the face resolution reconstruction method contains a series of face image samples whose resolution is higher than that of the input image.
The face reconstruction method provided by the embodiment of the present invention also includes aligning the image samples in the standard gradient image library. Before the facial feature components corresponding to the input image are determined from the standard gradient image library, the poses of the image samples in the library should first be aligned, to facilitate the accurate search of facial feature components and the training of the standard gradient image library described below.
FIG. 3 shows the concrete alignment flow for the image samples, and FIG. 4 shows a schematic diagram of the sample alignment. The alignment process between image samples is described in detail below with reference to FIG. 3 and FIG. 4.
First, as shown in FIG. 3, in step S1 one image sample is selected from the standard gradient image library as the standard image. With the pose of this standard image as the reference, the poses of the remaining image samples are aligned to it. In step S2, the feature points of the standard image are obtained by the SIFT algorithm, and a feature point set is selected from among them and stored. Next, in step S3, another image sample is selected from the standard gradient image library as the image to be aligned; its feature points are obtained by the SIFT algorithm, and the feature point set most similar to said feature point set of the standard image is found among them; this feature point set is used for pose alignment.
In step S4, the image to be aligned is rotated and scaled until its feature point set is in proportional correspondence with the feature point set of the standard image. The concrete process is illustrated in FIG. 4. In step S5, the feature points of the image to be aligned are translated to the positions of the feature point set of the standard image, yielding the SIFT optical flow. Next, in step S6, the remaining pixels of the image to be aligned are translated, using the optical flow information, to the corresponding positions on the standard image, completing the alignment of the image to be aligned with the standard image. The aligned image has the same pose as the standard image, which benefits the search step of the face resolution reconstruction method.
In step S7, it is judged whether there remain image samples in the standard gradient image library that have not been aligned with the standard image. If so, such a sample is selected as the image to be aligned and the alignment flow of steps S3–S6 is executed for it. If not, the alignment process ends. When the image samples in the standard gradient image library are updated, the updated samples should also undergo the above alignment process, ensuring that all image samples in the standard gradient image library have a consistent pose.
The face reconstruction method provided by the embodiment of the present invention also includes training the standard gradient image library. Every image sample in the library corresponds to a feature point set. The concrete training flow is shown in FIG. 5. First, in step S1, one image sample that has not previously been chosen as a training sample is selected from the library as the training sample. Gaussian smoothing and downsampling are applied to the training sample to obtain a low-resolution third image, and a feature mask map is obtained from the third image; the feature mask map covers the feature components delimited by the feature points of the third image.
In step S2, the feature mask map obtained in step S1 is used to search the standard gradient image library for the most similar facial feature components, for example in a self-learning manner. Next, in step S3, feature gradients are extracted from the most similar facial feature components found. In step S4, it is judged whether the feature gradients extracted from the most similar facial feature components correspond to the feature gradients of the training sample, thereby completing the training process for that sample. Repeated training helps find, during face resolution reconstruction, the facial feature components closest to the input image more accurately.
In step S5, it is judged whether the library still contains image samples that have not been chosen as training samples. If so, the flow returns to step S1, i.e. such a sample is selected for training and steps S1–S4 are carried out. If not, the training ends. When the image samples in the standard gradient image library are updated, the updated samples can also undergo the above training process.
The above training enables the face reconstruction method according to the embodiment of the present invention to adjust its parameters to suit the current standard gradient image library, i.e. to find the most similar facial feature components more efficiently. For example, during search the method according to the embodiment of the present invention iteratively adjusts its parameters according to the better results obtained. In face resolution reconstruction, the low-resolution input image plays the role of the training sample here.
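The Gaussian smoothing and downsampling used to derive the low-resolution third image from a library sample can be sketched with a separable kernel; the 1-2-1 kernel and downsampling factor below are assumptions for illustration, not values specified by the source:

```python
def smooth_and_downsample(img, factor=2):
    """Gaussian-smooth an image with a separable [1, 2, 1]/4 kernel, then keep
    every `factor`-th pixel -- a minimal stand-in for the training step that
    derives the low-resolution third image from a library sample."""
    h, w = len(img), len(img[0])
    def clamp(v, hi):
        return min(max(v, 0), hi - 1)
    # horizontal pass, then vertical pass, of the [1, 2, 1]/4 kernel
    tmp = [[(img[y][clamp(x - 1, w)] + 2 * img[y][x] + img[y][clamp(x + 1, w)]) / 4.0
            for x in range(w)] for y in range(h)]
    sm = [[(tmp[clamp(y - 1, h)][x] + 2 * tmp[y][x] + tmp[clamp(y + 1, h)][x]) / 4.0
           for x in range(w)] for y in range(h)]
    return [row[::factor] for row in sm[::factor]]
```

Because the clamped kernel weights sum to one, a constant image stays constant while its dimensions shrink by the downsampling factor.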
In summary, the present invention provides a machine-learning-based face resolution reconstruction method that preserves the overall structure of the face during reconstruction through the face edge gradient information and, combining the face feature gradient information with the background gradient information, generates a high-resolution face image from a standard gradient image library, thereby improving the image quality and adaptivity of the reconstructed image.
An embodiment of the present invention further provides a face resolution reconstruction system, whose structure is shown in FIG. 6. The face resolution reconstruction system includes one or more processors 601 and one or more memories 602. The memories store computer-executable instructions, and when the processor 601 executes the instructions stored in the memory 602, the above face resolution reconstruction method is performed. The face resolution reconstruction system may further include an input terminal 603 and an output terminal 604. The input terminal 603 receives a lower-resolution input image, and the output terminal 604 outputs the output image processed by the face resolution reconstruction system; the output image carries the information of the face in the input image as well as high-resolution detail information.
On this basis, the processor 601 can access the memory 602 through the system bus. Besides executable instructions, the memory 602 can also store training data and the like. The processor 601 can be any of various devices with computing capability, such as a central processing unit (CPU) or a graphics processing unit (GPU). The CPU can be an X86 or ARM processor; the GPU can be separately and directly integrated on the motherboard, built into the motherboard's north bridge chip, or built into the CPU. Because of the GPU's powerful image processing capability, embodiments of the present invention may preferably use a GPU to train the convolutional neural network and to perform image processing based on the convolutional neural network.
The face resolution reconstruction system can also include data storage accessible by the processor 601 through the system bus. The data storage can contain executable instructions, multi-image training data, and the like. The face resolution reconstruction system also includes an input interface that allows external devices to communicate with it; for example, the input interface can be used to receive instructions from an external computer device, from a user, and so on. The face resolution reconstruction system can also include an output interface that interfaces it with one or more external devices; for example, the system can display images through the output interface. External devices that communicate with the face resolution reconstruction system through the input interface and the output interface can be included in an environment that provides essentially any type of user interface with which a user can interact. Examples of user interface types include graphical user interfaces and natural user interfaces. For example, a graphical user interface can accept input from a user's input device(s), such as a keyboard, mouse, or remote control, and provide output on an output device such as a display. Moreover, a natural language interface can let a user interact with the face resolution reconstruction system free of the constraints imposed by input devices such as keyboards, mice, and remote controls; instead, a natural user interface can rely on speech recognition, touch and stylus recognition, gesture recognition on and near the screen, air gestures, head and eye tracking, voice and speech, vision, touch, gestures, machine intelligence, and so on.
Although the face resolution reconstruction system is shown in the figure as a single system, it can be understood that it can also be a distributed system and can also be arranged as a cloud facility (including a public or private cloud). Thus, for example, several devices can communicate over a network connection and collectively perform the tasks described as being performed by the face resolution reconstruction system.
The functions described herein can be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions can be stored on, or transmitted over, a computer-readable medium as one or more instructions or code.
An embodiment of the present invention further provides a computer-readable medium, which includes a computer-readable storage medium. The computer-readable storage medium can be any available storage medium that can be accessed by a computer. By way of example and not limitation, such computer-readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Propagated signals are not included within the scope of computer-readable storage media. Computer-readable media also include communication media, which include any medium that facilitates transfer of a computer program from one place to another. A connection can, for example, be a communication medium. For example, if the software is transmitted from a web site, server, or other remote source using coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then that coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technology is included in the definition of communication medium. Combinations of the above should also be included within the scope of computer-readable media. Alternatively or additionally, the functions described herein can be performed at least in part by one or more hardware logic components. For example, illustrative types of usable hardware logic components include field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), application-specific standard products (ASSP), systems on chip (SOC), and complex programmable logic devices (CPLD).
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto; the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (19)

  1. A face resolution reconstruction method, comprising:
    acquiring an input image, the input image having a first resolution;
    determining image gradient information of the input image based on the input image and a standard gradient image library having a second resolution;
    fusing the image gradient information and superimposing the fused gradient information onto the input image;
    generating an output image, the output image having a third resolution, wherein both the second resolution and the third resolution are higher than the first resolution.
  2. The reconstruction method according to claim 1, wherein the image gradient information comprises an edge gradient and a second-resolution feature gradient.
  3. The reconstruction method according to any one of claims 1-2, wherein the edge gradient is a face contour gradient, and determining the edge gradient of the input image comprises:
    upsampling the input image while preserving its directional characteristics to obtain a first image;
    computing the gradient information of the first image as the edge gradient information of the input image.
  4. The reconstruction method according to any one of claims 1-2, wherein the second-resolution feature gradient is the facial feature gradient in the standard gradient image library corresponding to the input image, and determining the second-resolution feature gradient comprises:
    estimating the pose of the input image and extracting feature points of the input image;
    searching the standard gradient image library for the corresponding facial feature components based on the pose and the feature points;
    computing the feature gradient of the corresponding facial feature components and determining it as the second-resolution feature gradient.
  5. The reconstruction method according to claim 4, wherein searching for the corresponding facial feature components comprises:
    aligning the pose of the input image with the poses of the image samples in the standard gradient image library according to the pose of the input image;
    finding the corresponding facial feature components in the standard gradient image library according to the feature points of the input image.
  6. The reconstruction method according to claim 5, wherein the pose of the input image is aligned with the poses of the image samples in the standard gradient image library according to the following formula:
    min∑||T(x(h),y(h))-(x(l),y(l))||2
    wherein T(x(h),y(h)) is the non-reflective similarity transform of an image sample and (x(l),y(l)) is the upsampled input image; the images are determined to be aligned when the difference between the feature-point coordinates of the transformed image sample and those of the upsampled input image reaches a minimum.
  7. The reconstruction method according to claim 1, wherein the image gradient information further comprises a background gradient.
  8. The reconstruction method according to claim 7, wherein the background gradient is the gradient of the flat regions of the input image, and determining the background gradient comprises:
    upsampling the input image to obtain a second image;
    computing the gradient information of the second image as the background gradient information of the input image.
  9. The reconstruction method according to claim 8, wherein obtaining the second image comprises:
    selecting the background region of the low-resolution image according to a self-learned statistical prior;
    transforming the background region into a high-resolution image with a set scale factor s, wherein each pixel of the background region is upsampled into an s×s pixel block;
    ensuring, through a back-propagation algorithm, that the upsampled image satisfies a Gaussian distribution.
  10. The reconstruction method according to claim 1, wherein fusing the image gradient information comprises:
    setting the top-left pixel value of the fused image region equal to the top-left pixel value of the input image, and then solving for the fused information by least squares:
    min||∇s-b||2
    wherein ∇ is the gradient operator, b is the gradient, and s is the fused information to be solved for.
  11. The reconstruction method according to claim 1, further comprising constructing the standard gradient image library.
  12. The reconstruction method according to claim 1, further comprising aligning the image samples in the standard gradient image library.
  13. The reconstruction method according to claim 12, wherein aligning the image samples in the standard gradient image library comprises:
    selecting one image sample from the standard gradient image library as a standard image;
    obtaining the feature points of the standard image by the SIFT algorithm, and selecting a feature point set from among them and storing it;
    aligning the remaining image samples in the standard gradient image library with the standard image, comprising: selecting an image sample from the standard gradient image library as an image to be aligned, obtaining its feature points by the SIFT algorithm, and finding among them the feature point set most similar to said feature point set of the standard image;
    rotating and scaling the image to be aligned until its feature point set is in proportional correspondence with the feature point set of the standard image;
    translating the coordinates of the feature point set of the image to be aligned to the positions of the feature point set of the standard image, obtaining the SIFT optical flow;
    using the optical flow information, translating the remaining pixels of the image to be aligned to the corresponding positions on the standard image, completing the alignment of the image to be aligned with the standard image.
  14. The reconstruction method according to claim 1, further comprising training the standard gradient image library.
  15. The reconstruction method according to claim 14, wherein training the standard gradient image library comprises:
    training every image sample in the standard gradient image library, wherein the training process for each image sample comprises:
    selecting one image sample from the standard gradient image library as a training sample; applying Gaussian smoothing and downsampling to the training sample to obtain a third image; and obtaining a feature mask map from the third image, wherein the feature mask map covers the feature components delimited by the feature points of the third image;
    using the feature mask map, searching the standard gradient image library in a self-learning manner for the most similar facial feature components;
    extracting feature gradients from the most similar facial feature components found.
  16. The reconstruction method according to claim 3, wherein the upsampling for obtaining the edge gradient is performed by bicubic interpolation:
    f(x,y) = ∑i ∑j f(x(i),y(j)) W(x-x(i)) W(y-y(j)), i,j = 0,1,2,3
    where (x,y) is the pixel to be interpolated, (x(i),y(j)) are the 4×4 neighborhood points around it, i,j = 0,1,2,3, and W is the weight function.
  17. A face resolution reconstruction system, comprising:
    one or more processors;
    one or more memories,
    wherein the memories store computer-executable instructions that, when run by the processors, perform the reconstruction method according to any one of claims 1-16.
  18. The reconstruction system according to claim 17, further comprising an input terminal and an output terminal, the input terminal receiving an input image having the first resolution, and the output terminal outputting an output image having the third resolution, the third resolution being higher than the first resolution.
  19. A computer-readable medium storing computer-executable instructions that, when run by a processor, perform the reconstruction method according to any one of claims 1-16.
PCT/CN2017/113642 2016-11-30 2017-11-29 Face resolution reconstruction method, reconstruction system and readable medium WO2018099405A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/062,339 US10825142B2 (en) 2016-11-30 2017-11-29 Human face resolution re-establishing method and re-establishing system, and readable medium
EP17876582.2A EP3550509A4 (en) 2016-11-30 2017-11-29 HUMAN FACE RESOLUTION RECOVERY PROCESS AND RECOVERY SYSTEM, AND READABLE SUPPORT

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611084243.3A 2016-11-30 2016-11-30 Face super-resolution reconstruction method, reconstruction device and computer system
CN201611084243.3 2016-11-30

Publications (1)

Publication Number Publication Date
WO2018099405A1 true WO2018099405A1 (zh) 2018-06-07

Family

ID=62242345

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/113642 WO2018099405A1 (zh) 2017-11-29 Face resolution reconstruction method, reconstruction system and readable medium

Country Status (4)

Country Link
US (1) US10825142B2 (zh)
EP (1) EP3550509A4 (zh)
CN (1) CN108133456A (zh)
WO (1) WO2018099405A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110490796A (zh) * 2019-04-11 2019-11-22 Fujian Normal University Face super-resolution processing method and system fusing high- and low-frequency components
CN111161356A (zh) * 2019-12-17 2020-05-15 Dalian University of Technology Infrared and visible light fusion method based on bilevel optimization
CN111353943A (zh) * 2018-12-20 2020-06-30 Hangzhou Hikvision Digital Technology Co., Ltd. Face image restoration method, device, and readable storage medium
CN112200152A (zh) * 2019-12-06 2021-01-08 China Media Group Super-resolution method for aligning face images based on a residual back-projection neural network
CN114708149A (zh) * 2022-04-18 2022-07-05 Beijing Institute of Technology Real-time microwave through-body imaging method based on deep learning algorithms

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111328448B (zh) * 2017-12-01 2021-08-03 Huawei Technologies Co., Ltd. Image processing method and device
US10540749B2 (en) * 2018-03-29 2020-01-21 Mitsubishi Electric Research Laboratories, Inc. System and method for learning-based image super-resolution
CN113994366A (zh) * 2019-05-03 2022-01-28 Huawei Technologies Co., Ltd. Multi-stage multi-reference bootstrapping for video super-resolution
CN110310293B (zh) * 2019-07-04 2021-08-10 Beijing ByteDance Network Technology Co., Ltd. Human body image segmentation method and device
CN110443752B (zh) * 2019-07-25 2023-05-05 Vivo Mobile Communication Co., Ltd. Image processing method and mobile terminal
CN110458758B (zh) * 2019-07-29 2022-04-29 Wuhan Institute of Technology Image super-resolution reconstruction method, system, and computer storage medium
CN110956599A (zh) * 2019-11-20 2020-04-03 Tencent Technology (Shenzhen) Co., Ltd. Picture processing method and device, storage medium, and electronic device
CN112991165B (zh) * 2019-12-13 2023-07-14 Shenzhen ZTE Microelectronics Technology Co., Ltd. Image processing method and device
CN111091158B (zh) * 2019-12-25 2024-04-30 iFlytek Co., Ltd. Method, device, and equipment for classifying the image quality of teaching-aid images
CN113744130B (zh) * 2020-05-29 2023-12-26 Wuhan TCL Group Industrial Research Institute Co., Ltd. Face image generation method, storage medium, and terminal device
CN111754405B (zh) * 2020-06-22 2023-08-08 Peking University Shenzhen Graduate School Image resolution reduction and restoration method, device, and readable storage medium
CN111932462B (zh) * 2020-08-18 2023-01-03 OPPO (Chongqing) Intelligent Technology Co., Ltd. Training method and device for image degradation models, electronic device, and storage medium
CN111932463B (zh) * 2020-08-26 2023-05-30 Tencent Technology (Shenzhen) Co., Ltd. Image processing method, apparatus, device, and storage medium
CN112203098B (zh) * 2020-09-22 2021-06-01 Guangdong Qidi Tuwei Technology Co., Ltd. Mobile-side image compression method based on edge feature fusion and super-resolution
CN112270645B (zh) * 2020-11-03 2022-05-03 South-Central Minzu University Progressive high-magnification face super-resolution system with multi-order feature recurrent enhancement, and method thereof
WO2022099710A1 (zh) * 2020-11-16 2022-05-19 BOE Technology Group Co., Ltd. Image reconstruction method, electronic device, and computer-readable storage medium
CN112529825B (zh) * 2020-12-11 2022-05-31 Ping An Technology (Shenzhen) Co., Ltd. Face image resolution reconstruction method, apparatus, device, and storage medium
CN113408347B (zh) * 2021-05-14 2022-03-15 Guilin University of Electronic Technology Method for detecting distant building changes with a surveillance camera
CN114240788B (zh) * 2021-12-21 2023-09-08 Southwest Petroleum University Robust and adaptive background restoration method for complex scenes

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100086227A1 (en) * 2008-10-04 2010-04-08 Microsoft Corporation Image super-resolution using gradient profile prior
CN102354397A (zh) * 2011-09-19 2012-02-15 Dalian University of Technology Face image super-resolution reconstruction method based on facial feature organ similarity
TW201246125A (en) * 2011-05-13 2012-11-16 Altek Corp Digital image processing device and processing method thereof
CN102968766A (zh) * 2012-11-23 2013-03-13 Shanghai Jiao Tong University Adaptive image super-resolution reconstruction method based on a dictionary database

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9123116B2 (en) * 2011-04-19 2015-09-01 University Of Southern California Multiview face capture using polarized spherical gradient illumination
US20040246510A1 (en) * 2003-06-06 2004-12-09 Jacobsen Dana A. Methods and systems for use of a gradient operator
US20050231512A1 (en) * 2004-04-16 2005-10-20 Niles Gregory E Animation of an object using behaviors
US7411590B1 (en) * 2004-08-09 2008-08-12 Apple Inc. Multimedia file format
US8743119B2 (en) * 2011-05-24 2014-06-03 Seiko Epson Corporation Model-based face image super-resolution
US9208539B2 (en) * 2013-11-30 2015-12-08 Sharp Laboratories Of America, Inc. Image enhancement using semantic components
CN103871041B (zh) * 2014-03-21 2016-08-17 Shanghai Jiao Tong University Image super-resolution reconstruction method based on cognitive regularization parameter construction
US9405960B2 (en) * 2014-06-17 2016-08-02 Beijing Kuangshi Technology Co., Ltd. Face hallucination using convolutional neural networks
CN105844590A (zh) * 2016-03-23 2016-08-10 Wuhan University of Technology Image super-resolution reconstruction method and system based on sparse representation
US10255522B2 (en) * 2016-06-17 2019-04-09 Facebook, Inc. Generating object proposals using deep-learning models

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100086227A1 (en) * 2008-10-04 2010-04-08 Microsoft Corporation Image super-resolution using gradient profile prior
TW201246125A (en) * 2011-05-13 2012-11-16 Altek Corp Digital image processing device and processing method thereof
CN102354397A (zh) * 2011-09-19 2012-02-15 Dalian University of Technology Face image super-resolution reconstruction method based on facial feature organ similarity
CN102968766A (zh) * 2012-11-23 2013-03-13 Shanghai Jiao Tong University Adaptive image super-resolution reconstruction method based on a dictionary database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3550509A4 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353943A (zh) * 2018-12-20 2020-06-30 Hangzhou Hikvision Digital Technology Co., Ltd. Face image restoration method, device, and readable storage medium
CN111353943B (zh) * 2018-12-20 2023-12-26 Hangzhou Hikvision Digital Technology Co., Ltd. Face image restoration method, device, and readable storage medium
CN110490796A (zh) * 2019-04-11 2019-11-22 Fujian Normal University Face super-resolution processing method and system fusing high- and low-frequency components
CN110490796B (zh) * 2019-04-11 2023-02-14 Fujian Normal University Face super-resolution processing method and system fusing high- and low-frequency components
CN112200152A (zh) * 2019-12-06 2021-01-08 China Media Group Super-resolution method for aligning face images based on a residual back-projection neural network
CN112200152B (zh) * 2019-12-06 2024-04-26 China Media Group Super-resolution method for aligning face images based on a residual back-projection neural network
CN111161356A (zh) * 2019-12-17 2020-05-15 Dalian University of Technology Infrared and visible light fusion method based on bilevel optimization
CN111161356B (zh) * 2019-12-17 2022-02-15 Dalian University of Technology Infrared and visible light fusion method based on bilevel optimization
CN114708149A (zh) * 2022-04-18 2022-07-05 Beijing Institute of Technology Real-time microwave through-body imaging method based on deep learning algorithms

Also Published As

Publication number Publication date
US10825142B2 (en) 2020-11-03
EP3550509A1 (en) 2019-10-09
US20180374197A1 (en) 2018-12-27
CN108133456A (zh) 2018-06-08
EP3550509A4 (en) 2020-08-05

Similar Documents

Publication Publication Date Title
WO2018099405A1 (zh) Face resolution reconstruction method, reconstruction system and readable medium
WO2020199931A1 (zh) Face key point detection method and device, storage medium, and electronic device
EP3505866B1 (en) Method and apparatus for creating map and positioning moving entity
CN110287846B Attention-mechanism-based face key point detection method
US11321593B2 (en) Method and apparatus for detecting object, method and apparatus for training neural network, and electronic device
CN108765363B AI-based coronary CTA automatic post-processing system
CN106651938B Depth map enhancement method fusing high-resolution color images
KR102663519B1 Cross-domain image translation technique
CN109858333B Image processing method, device, electronic device, and computer-readable medium
CN108229301B Eyelid line detection method, device, and electronic device
WO2022057526A1 Three-dimensional model reconstruction method, and training method and device for three-dimensional reconstruction models
CN108734078B Image processing method, device, electronic device, storage medium, and program
WO2022267311A1 Method and device for constructing landform maps, electronic device, and readable storage medium
CN110705337A Face recognition method and device for eyeglass occlusion
US11631154B2 (en) Method, apparatus, device and storage medium for transforming hairstyle
KR20230132350A Joint detection model training, joint detection method, device, equipment, and medium
CN117094362B Task processing method and related device
CN114049290A Image processing method, apparatus, device, and storage medium
WO2021098554A1 Feature extraction method, apparatus, device, and storage medium
Xu et al. 3D joints estimation of the human body in single-frame point cloud
CN116703992A Accurate registration method for 3D point-cloud data, apparatus, device, and storage medium
CN114972910B Training method and device for image-text recognition models, electronic device, and storage medium
CN116051730A Method, device, and equipment for constructing 3D vessel models
KR102494811B1 Real-time gaze tracking apparatus and method based on eye feature points
Yang et al. Depth super-resolution with color guidance: a review

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17876582

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017876582

Country of ref document: EP

Effective date: 20190701