CN113139909A

CN113139909A - Image enhancement method based on deep learning

Info

Publication number: CN113139909A
Application number: CN202010056687.6A
Authority: CN
Inventors: 桑葛楠; 李浬; 袁峰
Original assignee: Hangzhou Oying Network Technology Co ltd
Current assignee: Hangzhou Oying Network Technology Co ltd
Priority date: 2020-01-19
Filing date: 2020-01-19
Publication date: 2021-07-20
Anticipated expiration: 2040-01-19
Also published as: CN113139909B

Abstract

The invention discloses an image enhancement method based on deep learning, which comprises the following steps: selecting images shot by professional photographers, making the professional photographers perform image retouching, constructing a neural network training data set, and dividing the neural network training data set into training sets T_trainAnd test set T_test(ii) a A neural network S (-) adopting U-Net with global characteristics; the input of which is an original image S subjected to data enhancement_InputAnd prior illumination estimation I, and output is enhanced R_outputAnd I_output(ii) a Randomly initializing relevant parameters such as weight parameters, learning rate, batch size and the like in the neural network S (-); training the image enhancement neural network model by adopting an error back propagation algorithm, and calculating loss based on the weight map so as to obtain a depth image enhancement model; compared with the prior method, the method has more natural colors, more attractiveness and better contrast, has smaller difference with the images repaired by professional diagraphers, has no artifact phenomenon, has little inference time, and can run on equipment such as a mobile phone and the like in ms level in real time.

Description

Image enhancement method based on deep learning

Technical Field

The invention relates to the technical field of image processing, in particular to an image enhancement method based on deep learning.

Background

The photography is an art of light and shadow, and a favorite photo is accurately grasped for brightness and color shading of the light and shadow. With the popularity of portable photography devices such as mini-tickets, mobile phones and the like, more and more people use photos to record their lives, start enjoying photography and enjoy sharing their photography works on social networks. In general, people often take photos of themselves due to unnatural brightness, insufficiently saturated colors, and the like. Therefore, people often spend a lot of time beautifying their images before sharing their own photographic works. Despite the large number of interactive and semi-automatic image processing tools available on the market, these tools still have a large threshold for the user, and the beautified photos are also associated with the aesthetic level of the user.

Image enhancement has been a long-standing challenge in the field of computer vision and has been of interest to scholars. The traditional image enhancement algorithm mainly comprises histogram equalization, gray world assumption, wavelet transformation algorithm, automatic white balance and the like. These algorithms focus mainly on the contrast of the image and the correction of the color. However, these algorithms are only suitable for specific conditions, such as histogram equalization without selection of data processing, automatic white balance only suitable for uniform illumination, and the like. And the processed images still have a large gap from the expectations of people. In recent years, image processing algorithms based on deep learning have been greatly successful in the field of image enhancement, but these methods still have the disadvantages of low processing speed, undesirable effect and the like.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide an image enhancement method for deep learning, which meets the requirements of real-time performance, has more natural colors, more attractiveness and better contrast compared with the prior method, has smaller difference with the image repaired by a professional repairman, and has no artifact phenomenon.

1. An image enhancement method based on deep learning is characterized by comprising the following steps:

(A) selecting images shot by professional photographers, making the professional photographers perform image retouching, constructing a neural network training data set, and dividing the neural network training data set into training sets T_trainAnd test set T_test；

(B) A neural network S (-) adopting U-Net with global characteristics;

(C) the input of which is an original image S subjected to data enhancement_InputIn combination with the a-priori illumination estimate I,the output is enhanced R_outputAnd I_output；

(D) Randomly initializing relevant parameters such as weight parameters, learning rate, batch size and the like in the neural network S (-);

(E) and training the image enhancement neural network model by adopting an error back propagation algorithm, and calculating loss based on the weight map so as to obtain a depth image enhancement model.

2. The deep learning-based image enhancement method according to claim 1, wherein in the step (B), the U-Net neural network S (-) with global features is composed of a connecting step and an expanding step. In the cont parsing step of each step, the convolution layer with 3 × 3 filter steps of 1, the batch normalization layer, and the maximum pooling layer with 2 × 2 filter steps of 2 are included. In the expanding step, each step firstly executes deconvolution, the result of the deconvolution is spliced with the feature map of the corresponding step, and then the convolution layer with 3 multiplied by 3 and step length of 1 and the batch processing normalization layer are processed by 2 filters.

3. The method for enhancing image based on deep learning as claimed in claim 2, wherein in the step (C), the data is enhanced by down-sampling the original image to a specified resolution size and performing random cropping, rotation and the like on the original image.

4. The method for enhancing image based on deep learning of claim 3, wherein in the step (E), the weight map is a formula of loss calculation, which is composed of six modules including a local loss function, a color loss function, and L₁And an MS-SSIM loss function, a VGG loss function and an illumination smoothing loss function, which are expressed as:

(1) a local loss function. Randomly cropping image blocks from the input and label images and calculating L between the input and label image blocks₁The loss, expressed as:

wherein, f (x) represents the image block generated by prediction, Y is the label image block, and n is the batch processing parameter of training.

(2) A color loss function. Expressed as:

wherein X_bAnd Y_bRepresenting X and Y gaussian blurred images, respectively.

(3)L₁And MS-SSIM loss function. Expressed as:

L₁＝||X-Y||₁ (4)

L_MS-SSIM＝1-MS-SSIM(X,Y) (5)

wherein X and Y represent the predicted image and the target image, respectively.

(4) VGG loss function. Expressed as:

is a feature map obtained from the jth convolutional layer obtained from VGG-19, parameter C_j，H_j，W_jRespectively representing the number of channels, height and width of the convolution layer of the relevant j layers, and X and Y respectively representing the predicted image and the target image.

(5) A structure-aware illumination smoothing loss function. Expressed as:

wherein

Is shown horizontally

To above and below

Gradient of (a)_tThe coefficients are the strength of the structural perceptual coefficients.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a sample diagram of a data set of an image enhancement method based on deep learning according to the present invention.

FIG. 2 is a network model diagram of the image enhancement method based on deep learning according to the present invention.

Detailed Description

In order to make the objects, embodiments and advantages of the present invention more apparent, further detailed description is given herein with reference to specific examples:

MIT-Adobe FiveK provides 5000 original images, as well as image data that was manually retouched by 5 professional reviewers (A, B, C, D, E). However, the following disadvantages of the vek data set still exist, and firstly, the data volume of the vek data set is still small, so that the training of the neural network cannot be met, overfitting is easily caused, and the requirements under a real scene cannot be met. Secondly, it has the problem of less data diversity, a significant portion of the original image data is low contrast and low brightness, and a small portion of the data is over-exposed and covers only limited lighting conditions.

We hired a photographer, using a different single lens reflex camera device, to take a total of 18000 photos, containing a very rich scene, lighting, theme, etc. We have a good professional reviewer who revises these captured original photographs to achieve satisfactory revised photograph data, as shown in fig. 1, for the purpose of presenting a partial data set sample.

(B) A neural network S (-) adopting U-Net with global characteristics;

to improve the execution efficiency of the model, we reduce the channel dimension of each convolutional layer as shown in fig. 2. The backbone network of our model is based on U-Net. It consists of a contracting step and an expanding step. In the concentrating step of each step, the convolution layer with 3 × 3 filters and 1 step size, the batch normalization layer and the maximum pooling layer with 2 × 2 filters and 2 step size are formed. In the expanding step, each step firstly executes deconvolution, the result of the deconvolution is spliced with the feature map of the corresponding step, and then the convolution layer with 3 multiplied by 3 and step length of 1 and the batch processing normalization layer are processed by 2 filters.

(C) The input of which is an original image S subjected to data enhancement_InputAnd prior illumination estimation I, and output is enhanced R_outputAnd I_output；

The Retinex theory is based on the color constancy theory. He considers that the image S of the object seen by a person is obtained by reflecting incident light I from the surface of the object, the reflectivity R is determined by the object itself and is not changed by the incident light I, and the process can be expressed by the formula:

S＝RοI (1)

although the method based on Retinex theory makes great progress in dim light enhancement and underwater image enhancement. However, in image beautification, the label object learned by the neural network is from a professional reviewer who performs artistic modification on the original image, such as modifying the color of the image to be equal. These operations break the assumption of Retinex color constancy.

Inspired by the above, we designed a decomposed neural network to transform the original image S_InputOutput as enhanced R through neural network S (-)_outputAnd I_outputWe only design oneSmooth loss of illumination as R_outputAnd I_outputAnd (c) a constraint therebetween. And reconstructing S according to the formula (1)_output。

we constructed our neural network on Tensorflow, in

The TITAN V GPU is trained, the batch processing size is 8, and 100 batches are trained. Adam optimizer was used and the learning rate was set to 5e in the first 10 batches^-3The remaining lot is set to 5e^-4. We scale the original resolution of their image pairs to 1048 on the smallest side length and randomly crop the data to 1024 when training the network.

(E) Training the image enhancement neural network model by adopting an error back propagation algorithm, and calculating loss based on the weight map so as to obtain a depth image enhancement model;

the weight graph calculation loss formula consists of six modules including local loss function, color loss function and L₁And an MS-SSIM loss function, a VGG loss function and an illumination smoothing loss function, which are expressed as:

(2) A color loss function. Expressed as:

wherein X_bAnd Y_bRepresenting X and Y gaussian blurred images, respectively.

(3)L₁And MS-SSIM loss function. Expressed as:

L₁＝||X-Y||₁ (5)

L_MS-SSIM＝1-MS-SSIM(X,Y) (6)

(4) VGG loss function. Expressed as:

(5) A structure-aware illumination smoothing loss function. Expressed as:

wherein

Is shown horizontally

To above and below

Gradient of (a)_tIntensity of coefficient being structural perception coefficient。

The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or scope of the invention as defined in the appended claims.