
CN112669207A - Method for enhancing resolution of face image based on television camera

Info

Publication number: CN112669207A
Application number: CN202011521115.7A
Authority: CN
Inventors: 谢涛; 邹军; 高岚
Original assignee: Sichuan Changhong Electric Co Ltd
Current assignee: Sichuan Changhong Electric Co Ltd
Priority date: 2020-12-21
Filing date: 2020-12-21
Publication date: 2021-04-16

Abstract

The invention discloses a method for enhancing the resolution of a face image based on a television camera. The basic process is as follows: when a face is present in the camera picture and the ambient light is dim or the distance from the face to the camera is less than a set threshold, a face resolution enhancement mode is started. In this mode, the recognized face is first cropped, and the quality of the face-only image is evaluated with the Tenengrad gradient method. An image whose gradient value beta is smaller than a certain threshold is defined as a blurred image and is used as the input of a super-resolution network; through network inference, the difference image and the low-resolution image are combined to generate a high-resolution image that meets the requirement. The method balances accuracy and real-time performance, lays a foundation for the recognition of low-quality images, and ensures the precision of subsequent algorithms.

Description

Method for enhancing resolution of face image based on television camera

Technical Field

The invention relates to the technical field of image processing, in particular to a method for enhancing the resolution of a face image based on a television camera.

Background

In recent years, with the continuous development of artificial intelligence, more and more traditional industries have found new growth points with its help. Users are no longer satisfied with a television that only plays programs and expect it to be more intelligent. Combining machine vision with the traditional television set expands the modes of human-machine interaction and improves the user experience.

At present, the mainstream functions of camera-equipped televisions on the market include video recording, photographing, face recognition, and video calls, all of which rely on high-quality image input. Actually captured images, however, are often blurred because of insufficient light, long distance, and the like. Existing image enhancement methods usually either interpolate the low-resolution image directly, generating the image by pixel replacement, or generate it with a GAN. The former is simple, but interpolation easily produces jagged artifacts on detail-rich face images and enhances them poorly; the latter processes the whole image, which leads to a large number of network parameters and slow model inference.

Disclosure of Invention

To solve the above technical problem, the invention provides a method for enhancing the resolution of a face image based on a television camera, which trains a network on the difference between high-resolution and low-resolution images and combines that difference with the original low-resolution image to generate the final high-resolution face image. Under limited computing resources, the processing efficiency of the algorithm is improved as far as possible, so that the project's accuracy and real-time requirements are met.

In order to achieve the technical effect, the invention adopts the following technical scheme:

A face image resolution enhancement method based on a television camera comprises the following steps:

(1) judging whether the face resolution enhancement mode needs to be started: in a picture containing a human face, the face resolution enhancement mode is started when the ambient illuminance is less than 50 lx or the distance l from the face to the camera is less than 280 cm;

(2) recognizing a face image;

(3) standardizing a face image;

(4) evaluating the image quality of the face: the sharpness of the face is evaluated using the Tenengrad gradient method, and a face image with gradient value g less than 10.0 is defined as a blurred image;

(5) performing super-resolution enhancement processing: the blurred image is taken as the input of the super-resolution network, and the difference image and the low-resolution image are combined through network inference to generate a high-resolution image that meets the requirement.
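
The two decisions in steps (1) and (4) can be sketched as simple threshold checks. This is only an illustrative sketch: the thresholds (50 lx, 280 cm, 10.0) come from the text, while the function names and the way the sensor readings are obtained are assumptions.

```python
# Thresholds taken from steps (1) and (4); all names are hypothetical.
LUX_THRESHOLD = 50.0            # ambient illuminance, in lx
DISTANCE_THRESHOLD_CM = 280.0   # face-to-camera distance, in cm
BLUR_THRESHOLD = 10.0           # Tenengrad gradient value g


def enhancement_mode_enabled(face_present, lux, distance_cm):
    """Step (1): start the mode when a face is present and either condition holds."""
    return bool(face_present) and (
        lux < LUX_THRESHOLD or distance_cm < DISTANCE_THRESHOLD_CM
    )


def needs_super_resolution(gradient_value):
    """Step (4): an image with g below the threshold counts as blurred."""
    return gradient_value < BLUR_THRESHOLD
```

Only images that pass both checks would be sent to the super-resolution network in step (5); sharp faces skip the network entirely, which is how the method saves computation.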

A further technical scheme is that step (2) specifically uses a third-party face recognition technology to obtain the coordinates of the face bounding box, and the face image is cropped according to the coordinates.

A further technical scheme is that step (3) specifically scales the face region cropped in step (2) uniformly to a standard-size image for input.

According to a further technical scheme, the standard size is 96 × 96 px.
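
Steps (2) and (3) can be illustrated with a small sketch that crops a bounding box and rescales the result to 96 × 96. The image is represented as a plain list of pixel rows, and nearest-neighbour sampling is an assumption made to keep the example dependency-free; the text does not say which interpolation is used, and the function names are hypothetical.

```python
STANDARD_SIZE = 96  # standard input size in pixels, from the text


def crop_face(image, p0, p1):
    """Crop the region between top-left p0=(x0, y0) and bottom-right p1=(x1, y1).

    The bottom-right corner is treated as exclusive, and the box is clamped
    to the image borders.
    """
    height, width = len(image), len(image[0])
    x0, y0 = max(0, p0[0]), max(0, p0[1])
    x1, y1 = min(width, p1[0]), min(height, p1[1])
    if x1 <= x0 or y1 <= y0:
        raise ValueError("bounding box does not overlap the image")
    return [row[x0:x1] for row in image[y0:y1]]


def resize_nearest(image, size=STANDARD_SIZE):
    """Scale an image to size x size by nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[(y * src_h) // size][(x * src_w) // size] for x in range(size)]
        for y in range(size)
    ]
```

In practice an image library's resize routine with a smoother interpolation would likely be preferred; the sketch only shows the data flow from bounding box to fixed-size network input.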

A further technical scheme is that step (4) specifically evaluates the sharpness of the detected face using the Tenengrad gradient method, which uses the Sobel operator to calculate the gradients in the horizontal and vertical directions; the higher the gradient value, the sharper the image, and a face image with gradient value g less than 10.0 is defined as a blurred image requiring resolution enhancement.
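
A minimal sketch of a Tenengrad-style sharpness score follows. The text does not specify how the Sobel responses are aggregated or normalized, so the choice here (mean gradient magnitude over interior pixels of a grayscale image) is an assumption, and the threshold of 10.0 is only meaningful relative to whatever normalization the real system uses.

```python
import math

# Standard 3x3 Sobel kernels for the horizontal and vertical directions.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def tenengrad(gray):
    """Mean Sobel gradient magnitude of a grayscale image (list of rows).

    Border pixels are skipped so that every 3x3 window lies inside the image.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    total = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            total += math.sqrt(gx * gx + gy * gy)
    return total / ((height - 2) * (width - 2))
```

A flat image scores 0, a gentle ramp scores low, and an image with hard edges scores high, which matches the statement that a higher gradient value means a sharper image.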

The further technical scheme is that the super-resolution network in the step (5) is a deep convolutional neural network, and the training process of the network is as follows:

A. 10000 sharp face images are collected and organized as sample labels; artificially blurred copies of them are used as training samples; all images are uniformly set to a size of 96 × 96 × 3;

B. to enrich the sample set, the sample and label images are synchronously cropped, rotated, and varied in brightness, thereby augmenting the data set;

C. sending the sample into a super-resolution network for training, and calculating a loss value;

D. if the model loss falls to 0.1 or the number of training steps reaches a set number, training ends; otherwise, step C is repeated.
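
The stopping rule in steps C and D can be sketched as a small control loop. The loss target of 0.1 is from the text and the default step limit of 20000 is the figure given in the embodiment; `train_step` is a hypothetical callable standing in for one optimization step of the super-resolution network, whose architecture and loss function are not specified here.

```python
def train_until_converged(train_step, loss_target=0.1, max_steps=20000):
    """Repeat step C until the loss reaches loss_target or max_steps is hit.

    train_step is a hypothetical callable that runs one training step and
    returns the current loss as a float.
    """
    loss = float("inf")
    steps = 0
    while steps < max_steps:
        loss = train_step()      # step C: train on a batch, compute the loss
        steps += 1
        if loss <= loss_target:  # step D: stop once the loss is low enough
            break
    return steps, loss
```

For example, with a stand-in that returns a recorded loss sequence, `train_until_converged(lambda: next(losses))` stops at the first value at or below 0.1.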

Compared with the prior art, the invention has the following beneficial effects: it provides a new method for enhancing the resolution of a face image, which trains on the difference between high-resolution and low-resolution images and combines the difference with the original low-resolution image to generate the final high-resolution face image. Under limited computing resources, the processing efficiency of the algorithm is improved as far as possible; the method balances accuracy and real-time performance, lays a foundation for the recognition of low-quality images, and guarantees the accuracy of subsequent algorithms.

Drawings

FIG. 1 is a flow chart of the present invention;

FIG. 2 is a diagram of the super-resolution network structure.

Detailed Description

Example 1

In deep learning, deeper and wider networks generally achieve better results, but the performance gain of a complex network structure comes at the cost of efficiency. The invention therefore focuses on the design of the super-resolution network structure, in both width and depth, so that network precision is maintained while the real-time requirement is met by reducing the size of the network.

Specifically, the invention relates to a method for enhancing the resolution of a face image based on a television camera, which comprises the following steps:

Step 1, judging whether to start the face resolution enhancement mode: in a picture containing a human face, the face resolution enhancement mode is started when the ambient illuminance is less than 50 lx or the distance l from the face to the camera is less than 280 cm;

Step 2, recognizing the face image: using a third-party face recognition technology, the coordinates of the face bounding box (upper left corner (p0x, p0y) and lower right corner (p1x, p1y)) are obtained, and the face image is cropped according to these coordinates;

Step 3, standardizing the face image: the face region cropped in Step 2 is uniformly scaled to the standard size of 96 × 96 for input;

Step 4, evaluating the image quality of the face: the sharpness of the detected face is evaluated using the Tenengrad gradient method, which uses the Sobel operator to calculate the gradients in the horizontal and vertical directions; the higher the gradient value, the sharper the image. A face image with gradient value g < 10.0 is defined as an image requiring resolution enhancement;

Step 5, super-resolution enhancement processing: the blurred image from Step 4 is taken as the input of the super-resolution network, and the difference image and the low-resolution image are combined through network inference to generate a high-resolution image that meets the requirement.
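
The combination in Step 5 can be read as residual reconstruction: the network predicts a difference image, which is added to the low-resolution input. The sketch below assumes that the difference image has the same spatial size as the input and that pixel values lie in 0..255; both are assumptions, as is the function name.

```python
def combine_with_difference(low_res, difference):
    """Add a predicted difference image to the low-resolution input.

    Both arguments are single-channel images given as lists of rows with
    identical shapes. Results are rounded and clipped to the 0..255 range.
    """
    if len(low_res) != len(difference) or any(
        len(a) != len(b) for a, b in zip(low_res, difference)
    ):
        raise ValueError("images must have the same shape")
    return [
        [min(255, max(0, int(round(p + d)))) for p, d in zip(row, diff_row)]
        for row, diff_row in zip(low_res, difference)
    ]
```

Because the network only has to model the difference rather than the whole face, a comparatively small model may suffice, which is consistent with the real-time goal stated above.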

The pre-trained image super-resolution model is a deep convolutional neural network, and the training process of the network is as follows:

A. collecting and organizing about 10000 sharp face images as sample labels, artificially blurring them to serve as training samples, and uniformly setting all images to a size of 96 × 96 × 3;

B. to enrich the sample set, synchronously cropping, rotating, and varying the brightness of the sample and label images, thereby augmenting the data set;

C. feeding the samples into the super-resolution network for training and calculating the loss value;

D. if the model loss falls to 0.1 or the number of training steps reaches a set number (20,000), training ends; otherwise, step C is repeated.
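
The construction of training pairs can be sketched as follows. The text only says that the sharp labels are artificially blurred, so the box blur used here is an assumption, as is treating the label-minus-sample difference as the regression target. `rotate90_pair` illustrates the synchronized augmentation described in the Disclosure with a single 90-degree rotation; all names are hypothetical.

```python
def box_blur(image, radius=1):
    """Blur a single-channel image by averaging each pixel's neighbourhood."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def make_training_pair(label):
    """Return (blurred sample, difference target) for one sharp label image."""
    sample = box_blur(label)
    target = [
        [l - s for l, s in zip(label_row, sample_row)]
        for label_row, sample_row in zip(label, sample)
    ]
    return sample, target


def rotate90_pair(sample, label):
    """Rotate sample and label clockwise together so they stay aligned."""
    def rotate(image):
        return [list(row) for row in zip(*image[::-1])]
    return rotate(sample), rotate(label)
```

Applying the same transform to both images of a pair is what keeps the difference target valid after augmentation.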

Although the present invention has been described herein with reference to the illustrated embodiments thereof, which are intended to be preferred embodiments of the present invention, it is to be understood that the invention is not limited thereto, and that numerous other modifications and embodiments can be devised by those skilled in the art that will fall within the spirit and scope of the principles of this disclosure.

Claims

1. A face image resolution enhancement method based on a television camera is characterized by comprising the following steps:

(2) recognizing a face image;

(3) standardizing a face image;

2. The method for enhancing the resolution of the facial image based on the television camera as claimed in claim 1, wherein the step (2) is specifically to adopt a third-party face recognition technology to obtain the coordinates of the face bounding box and cut the facial image according to the coordinates.

3. The method for enhancing the resolution of the facial image based on the television camera as claimed in claim 2, wherein step (3) is specifically to uniformly scale the facial region cropped in step (2) to a standard-size image for input.

4. The method according to claim 3, wherein the standard size is 96x96 px.

5. The method for enhancing the resolution of the facial image based on the television camera as claimed in claim 1, wherein the step (4) is specifically to perform the sharpness evaluation on the detected face by using a Tenengrad gradient method, the Tenengrad gradient method uses a Sobel operator to calculate the gradients in the horizontal and vertical directions respectively, the higher the gradient value is, the sharper the image is, and the facial image with the gradient value g <10.0 is defined as the blurred image which needs the resolution enhancement processing.

6. The method for enhancing the resolution of facial images based on a television camera as claimed in claim 1, wherein the super-resolution network in step (5) is a deep convolutional neural network, and the training process of the network is as follows: