CN113139909B

CN113139909B - Image enhancement method based on deep learning

Info

Publication number: CN113139909B
Application number: CN202010056687.6A
Authority: CN
Inventors: 桑葛楠; 李浬; 袁峰
Original assignee: Hangzhou Oying Network Technology Co ltd
Current assignee: Hangzhou Oying Network Technology Co ltd
Priority date: 2020-01-19
Filing date: 2020-01-19
Publication date: 2022-08-02
Anticipated expiration: 2040-01-19
Also published as: CN113139909A

Abstract

The invention discloses an image enhancement method based on deep learning, which comprises the following steps: selecting images shot by professional photographers, making the professional photographers perform image retouching, constructing a neural network training data set, and dividing the neural network training data set into training sets T _train And test set T _test (ii) a A neural network S (-) adopting U-Net with global characteristics; the input of which is an original image S subjected to data enhancement _Input And prior illumination estimation I, and output is enhanced R _output And I _output (ii) a Randomly initializing relevant parameters such as weight parameters, learning rate, batch size and the like in the neural network S (-); training the image enhancement neural network model by adopting an error back propagation algorithm, and calculating loss based on the weight map so as to obtain a depth image enhancement model; compared with the prior method, the method has more natural colors, more attractiveness and better contrast, has smaller difference with the images repaired by professional diagraphers, has no artifact phenomenon, has little inference time, and can run on equipment such as a mobile phone and the like in ms level in real time.

Description

Image enhancement method based on deep learning

Technical Field

The invention relates to the technical field of image processing, in particular to an image enhancement method based on deep learning.

Background

The photography is an art of light and shadow, and a favorite photo is accurately grasped for brightness and color shading of the light and shadow. With the popularity of portable photography devices such as mini-tickets, mobile phones and the like, more and more people use photos to record their lives, start enjoying photography and enjoy sharing their photography works on social networks. In general, people often take photos of themselves due to unnatural brightness, insufficiently saturated colors, and the like. Therefore, people often spend a lot of time beautifying their images before sharing their photographic works. Despite the large number of interactive and semi-automatic image processing tools available on the market, these tools still have a large threshold for the user, and the beautified photos are also associated with the aesthetic level of the user.

Image enhancement has been a long-standing challenge in the field of computer vision and has been of interest to scholars. The traditional image enhancement algorithm mainly comprises histogram equalization, gray world assumption, wavelet transformation algorithm, automatic white balance and the like. These algorithms focus mainly on the contrast of the image and the correction of the color. However, these algorithms are only suitable for specific conditions, such as histogram equalization without selection of data processing, automatic white balance only suitable for uniform illumination, and the like. And the processed images still have a large gap from the expectations of people. In recent years, image processing algorithms based on deep learning have been greatly successful in the field of image enhancement, but these methods still have the disadvantages of low processing speed, undesirable effect and the like.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide an image enhancement method for deep learning, which meets the requirements of real-time performance, has more natural colors, more attractiveness and better contrast compared with the prior method, has smaller difference with the image repaired by a professional repairman, and has no artifact phenomenon.

1. An image enhancement method based on deep learning is characterized by comprising the following steps:

(A) selecting images shot by professional photographers, making the professional photographers perform image retouching, constructing a neural network training data set, and dividing the neural network training data set into training sets T _train And test set T _test ；

(B) A neural network S (-) adopting U-Net with global characteristics;

(C) the input of which is an original image S subjected to data enhancement _Input And prior illumination estimation I, and output is enhanced R _output And I _output ；

(D) Randomly initializing relevant parameters such as weight parameters, learning rate, batch size and the like in the neural network S (-);

(E) and training the image enhancement neural network model by adopting an error back propagation algorithm, and calculating loss based on the weight map so as to obtain a depth image enhancement model.

2. The deep learning-based image enhancement method according to claim 1, wherein in the step (B), the U-Net neural network S (-) with global features is composed of a connecting step and an expanding step. In the cont parsing step of each step, the convolution layer with 3 × 3 filter steps of 1, the batch normalization layer, and the maximum pooling layer with 2 × 2 filter steps of 2 are included. In the expanding step, each step firstly executes deconvolution, the result of the deconvolution is spliced with the feature map of the corresponding step, and then the convolution layer with 3 multiplied by 3 and step length of 1 and the batch processing normalization layer are processed by 2 filters.

3. The method for enhancing image based on deep learning as claimed in claim 2, wherein in the step (C), the data is enhanced by down-sampling the original image to a specified resolution size and performing random cropping, rotation and the like on the original image.

4. The method for enhancing image based on deep learning of claim 3, wherein in the step (E), the weight map is a formula of loss calculation, which is composed of six modules including a local loss function, a color loss function, and L ₁ And an MS-SSIM loss function, a VGG loss function and an illumination smoothing loss function, which are expressed as:

(1) a local loss function. Randomly cropping image blocks from the input and label images and calculating L between the input and label image blocks ₁ The loss, expressed as:

wherein, f (x) represents the image block generated by prediction, Y is the label image block, and n is the batch processing parameter of training.

(2) A color loss function. Expressed as:

wherein X _b And Y _b Representing X and Y gaussian blurred images, respectively.

(3)L ₁ And MS-SSIM loss function. Expressed as:

L ₁ ＝||X-Y|| ₁ (4)

L _MS-SSIM ＝1-MS-SSIM(X,Y) (5)

wherein X and Y represent the predicted image and the target image, respectively.

(4) VGG loss function. Expressed as:

is a feature map obtained from the jth convolutional layer obtained from VGG-19, parameter C _j ，H _j ，W _j Respectively representing the number of channels, height and width of the convolution layer of the relevant j layers, and X and Y respectively representing the predicted image and the target image.

(5) A structure-aware illumination smoothing loss function. Expressed as:

wherein

Is shown horizontally

To above and below

Gradient of (a) _t The coefficients are the strength of the structural perceptual coefficients.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a sample diagram of a data set of an image enhancement method based on deep learning according to the present invention.

FIG. 2 is a network model diagram of the image enhancement method based on deep learning according to the present invention.

Detailed Description

In order to make the objects, embodiments and advantages of the present invention more apparent, further detailed description is given herein with reference to specific examples:

MIT-Adobe FiveK provides 5000 original images, as well as image data that was manually retouched by 5 professional reviewers (A, B, C, D, E). However, the following disadvantages of the vek data set still exist, and firstly, the data volume of the vek data set is still small, so that the training of the neural network cannot be met, overfitting is easily caused, and the requirements under a real scene cannot be met. Secondly, it has the problem of less data diversity, a significant portion of the original image data is low contrast and low brightness, and a small portion of the data is over-exposed and covers only limited lighting conditions.

We hired a photographer, using a different single lens reflex camera device, to take a total of 18000 photos, containing a very rich scene, lighting, theme, etc. We have a good professional reviewer who revises these captured original photographs to achieve satisfactory revised photograph data, as shown in fig. 1, for the purpose of presenting a partial data set sample.

(B) A neural network S (-) adopting U-Net with global characteristics;

to improve the execution efficiency of the model, we reduce the channel dimension of each convolutional layer as shown in fig. 2. The backbone network of our model is based on U-Net. It consists of a contracting step and an expanding step. In the concentrating step of each step, the convolution layer with 3 × 3 filters and 1 step size, the batch normalization layer and the maximum pooling layer with 2 × 2 filters and 2 step size are formed. In the expanding step, each step firstly executes deconvolution, the result of the deconvolution is spliced with the feature map of the corresponding step, and then the convolution layer with 3 multiplied by 3 and step length of 1 and the batch processing normalization layer are processed by 2 filters.

(C) The input of which is an original graph enhanced by dataImage S _Input And prior illumination estimation I, and output is enhanced R _output And I _output ；

The Retinex theory is based on the color constancy theory. He considers that the image S of the object seen by a person is obtained by reflecting incident light I from the surface of the object, the reflectivity R is determined by the object itself and is not changed by the incident light I, and the process can be expressed by the formula:

S＝RοI (1)

although the method based on Retinex theory makes great progress in dim light enhancement and underwater image enhancement. However, in image beautification, the label objects learned by the neural network are from professional reviewers who make artistic modifications to the original image, such as modifying the color of the image equally. These operations break the assumption of Retinex color constancy.

Inspired by the above, we designed a decomposed neural network to transform the original image S _Input Output as enhanced R via a neural network S (-) _output And I _output We only design one illumination smoothness loss as R _output And I _output And (c) a constraint therebetween. And reconstructing S according to the formula (1) _output 。

we constructed our neural network on Tensorflow, in

The TITAN V GPU is trained, the batch processing size is 8, and 100 batches are trained. Adam optimizer was used and the learning rate was set to 5e in the first 10 batches ^-3 The remaining lot is set to 5e ^-4 . We scale the original resolution of their image pairs to 1048 on the smallest side length and randomly crop the data to 1024 when training the network.

(E) Training the image enhancement neural network model by adopting an error back propagation algorithm, and calculating loss based on the weight map so as to obtain a depth image enhancement model;

weight ofThe figure calculation loss formula consists of six modules including local loss function, color loss function and L ₁ And an MS-SSIM loss function, a VGG loss function and an illumination smoothing loss function, which are expressed as:

(2) A color loss function. Expressed as:

(3)L ₁ And MS-SSIM loss function. Expressed as:

L ₁ ＝||X-Y|| ₁ (5)

L _MS-SSIM ＝1-MS-SSIM(X,Y) (6)

(4) VGG loss function. Expressed as:

(5) A structure-aware illumination smoothing loss function. Expressed as:

wherein

Is shown on the horizontal

To the vertical

The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or scope of the invention as defined in the appended claims.

Claims

(A) selecting images shot by professional photographers, making the professional photographers perform image retouching, constructing a neural network training data set, and dividing the neural network training data set into a training set T _train And test set T _test ；

(B) A neural network S (-) adopting U-Net with global characteristics;

(C) s (-) input as the raw image S subjected to data amplification _input And a priori illumination estimation map I _input The output is the adjusted reflection chart R _output And an illumination pattern I _output And R is _output And I _output Element-wise multiplication to obtain an enhanced image S _output ；

Wherein the training process of S (-) of the neural network of U-Net of global features is as follows: randomly initializing weight parameters, learning rate and batch size related parameters in a neural network S (-) to train the neural network model by adopting an error back propagation algorithm, training the network S (-) by using an adam optimization method, calculating loss based on a weight map, and stopping training and storing the network S (-) to obtain an image enhancement model when a loss function reaches an expectation;

the weight map is used for calculating loss and consists of six modules including a local loss function, a color loss function and L ₁ And an MS-SSIM loss function, a VGG loss function and an illumination smoothing loss function, which are expressed as:

(1) a local loss function; randomly cropping image blocks from the predicted image and label images and calculating the L between image blocks ₁ The loss, expressed as:

wherein, C _x Representing cropped image blocks in a predicted image, C _y The method comprises the steps of (1) obtaining a cut image block in a label image, wherein n is a training batch processing parameter;

(2) a color loss function; the predicted image X and the tag image Y are gaussian blurred and the euclidean distance between them is calculated, and is expressed as:

wherein X _b And Y _b Respectively represent X and YImages after Gaussian blur;

(3)L ₁ and an MS-SSIM loss function; expressed as:

L ₁ ＝||X-Y|| ₁ (4)

L _MS-SSIM ＝1-MS-SSIM(X,Y) (5)

wherein X and Y represent a predicted image and a tag image, respectively;

(4) VGG loss function, expressed as:

is a feature map obtained from the jth convolutional layer obtained from VGG-19, parameter C _j ，H _j ，W _j Respectively representing the number of channels, height and width of the related j layers of convolution layers, wherein X and Y respectively represent a predicted image and a label image;

(5) a structure-aware illumination smoothing loss function; expressed as:

wherein

Is shown horizontally

To above and below

2. The deep learning-based image enhancement method according to claim 1, wherein in the step (B), the U-Net neural network with global features S (-) is specifically composed of a contracting step and an expanding step, each contracting step is composed of 2 convolution layers with 3 × 3 step size 1 filter, a batch normalization layer and a maximum pooling layer with 2 × 2 step size 2 filter, each expanding step is performed with deconvolution first, the result of deconvolution is spliced with the feature map of the corresponding step, and then the convolution layer and the batch normalization layer with 3 × 3 step size 1 filter are passed through 2 filters.

3. The method for enhancing image based on deep learning as claimed in claim 2, wherein in the step (C), the data is augmented by down-sampling the original image to a specified resolution size and performing random cropping and rotation operations on the original image.