CN109598695B - No-reference image fuzzy degree estimation method based on deep learning network - Google Patents


Info

Publication number
CN109598695B
CN109598695B (application CN201710909377.2A; earlier publication CN109598695A)
Authority
CN
China
Prior art keywords
network
fuzzy
clear
training
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710909377.2A
Other languages
Chinese (zh)
Other versions
CN109598695A (en)
Inventor
岳涛
索津莉
袁飞
黄华
陈鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201710909377.2A
Publication of CN109598695A
Application granted
Publication of CN109598695B
Legal status: Active

Classifications

    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T9/002 Image coding using neural networks
    • G06T2207/10024 Color image
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30168 Image quality inspection
    • Y02T10/40 Engine management systems

Abstract

The invention discloses a no-reference image blur degree estimation method based on a deep learning network. The method comprises the following steps: (1) preprocessing images to generate training data; (2) training a two-channel sharp/blur perception network with sharp and blurred image patches, extracting sharp and blur features and reconstructing the input; (3) feeding the features extracted by the sharp/blur perception network into a joint perception network, which is trained with blurred image patches to obtain intrinsic blur features; (4) training a nonlinear feature mapping and regression network that maps the intrinsic features of step (3) to a blur degree; and (5) fine-tuning the whole network to optimize all parameters. The method achieves good accuracy in no-reference estimation of slight blur and can be effectively applied to blur detection, depth estimation, depth-of-field editing, defogging, and similar tasks.

Description

No-reference image fuzzy degree estimation method based on deep learning network
Technical Field
The invention relates to the fields of computational photography and deep learning, and in particular to a no-reference image blur degree estimation method based on a deep learning network.
Background
The availability of digital cameras and the spread of networks have prompted the rapid development of photography. A large number of pictures are taken by amateurs and are distorted, especially blurred, by incorrect camera settings. Blur is a common degradation in natural images; although it degrades visual quality and hampers subsequent processing, it also provides rich cues for many vision problems and applications (e.g., deblurring, refocusing, depth estimation, segmentation). An accurate estimate of the degree of blur is therefore important. However, estimating a non-uniform blur level from a single natural picture is very difficult. Many existing blur estimation methods rely on hand-crafted features, most of which are unreliable or hard to discriminate in specific scenes. Better feature extraction is therefore required for no-reference image blur degree estimation.
Deep learning learns more useful features by constructing machine learning models with many hidden layers and training them on massive data, thereby improving the accuracy of classification or prediction. By means of layer-by-layer feature transformation, the representation of a sample in its original space is transformed into a new feature space in which classification or prediction is easier. Compared with constructing features by manual rules, learning features from big data can capture the rich intrinsic information of the data.
Therefore, how to construct and train an effective deep learning network to estimate the image blur degree is a current research direction.
Disclosure of Invention
To address the shortcomings of existing blur degree estimation methods, the invention aims to provide a no-reference image blur degree estimation method based on a deep learning network.
To achieve this purpose, the invention adopts the following technical scheme:
A no-reference image blur degree estimation method based on a deep learning network comprises the following steps:
Step 1, generating training data: down-sampling the image, cropping sharp image patches containing structure or texture, and convolving the sharp patches with a Gaussian blur kernel to generate blurred patches;
Step 2, training the two sparse autoencoders of a sharp perception network and a blur perception network with the sharp and blurred image patches respectively, extracting effective sharp and blur features, and decoding to reconstruct the input;
Step 3, taking the features extracted by the sharp and blur perception networks as the input of a joint perception network, training the joint perception network with blurred image patches, and obtaining intrinsic blur features;
Step 4, training a nonlinear feature mapping and regression network that maps the intrinsic features of step 3 to a blur degree;
Step 5, fine-tuning the whole network and optimizing all parameters.
The invention uses a deep neural network comprising three sub-networks: a two-channel sharp/blur perception network, a sharp-blur joint perception network, and a nonlinear feature mapping and regression network. By adopting a layered training strategy and a large amount of training data, the internal parameters of the deep network are obtained progressively from front to back and serve as prior information for subsequent training, which greatly accelerates the convergence of the network parameters. Compared with the prior art, the method achieves good accuracy in no-reference estimation of slight blur and can be effectively applied to blur detection, depth estimation, depth-of-field editing, defogging, and similar tasks.
Drawings
FIG. 1 is a deep learning network architecture for the method of the present invention;
FIG. 2 is a flow chart of the method of the present invention.
Detailed Description
The invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1 and fig. 2, the no-reference image blur degree estimation method based on a deep learning network according to this embodiment includes the following specific steps:
Step 1, generating training data: converting color images to grayscale; down-sampling each image to retain its most salient structures; cropping 13 × 13 image patches containing structure or texture; and convolving each sharp patch with a Gaussian blur kernel to generate blurred patches.
Step 2, training the two sparse autoencoders of the sharp perception sub-network and the blur perception sub-network with the sharp image patches and the patches of fixed blur level, respectively; extracting effective sharp and blur features, and decoding to reconstruct the input.
Two fully-connected three-layer networks are established, each comprising an input layer, a hidden layer and an output layer. In this embodiment, the input layer has 169 nodes, the hidden layer 100 nodes, and the output layer 169 nodes. For an input image patch x, the feature extracted by the hidden layer and the reconstruction produced by the output layer are, respectively,
h = f(W1*x + b1)
x' = f(W2*h + b2)
where f(x) = 1/(1 + exp(-x)) is the nonlinear (sigmoid) function, W1, W2 represent weights and b1, b2 represent biases.
The values of W1, W2, b1 and b2 are adjusted by back-propagation so as to reduce the reconstruction error. To extract the most effective features, the activations of the hidden units are constrained to be sparse: a regularization term penalizes deviation of the hidden-unit activation level from a small target value. Training the network is therefore equivalent to the following optimization problem:
min_{W1, W2, b1, b2}  (1/p) * Σ_{i=1}^{p} ||x_i' - x_i||^2 + β * Σ_j SP(ρ||ρ_j)
SP(ρ||ρ_j) = ρ log(ρ/ρ_j) + (1-ρ) log((1-ρ)/(1-ρ_j))
where x_i and x_i' denote an input image patch and its output reconstruction, p is the number of input patches, SP(ρ||ρ_j) is the sparsity penalty term, ρ is the target sparsity level, ρ_j is the average activation of the j-th hidden unit, and β controls the weight of the sparsity penalty.
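A minimal numerical sketch of this objective, assuming a sigmoid autoencoder with the 169-100-169 layout described above (the weight scales, β, and the target sparsity ρ here are illustrative choices, not values from the patent):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_loss(X, W1, b1, W2, b2, rho=0.05, beta=0.1):
    """Mean reconstruction error plus the KL-style sparsity penalty SP(rho||rho_j)."""
    H = sigmoid(X @ W1 + b1)              # hidden features, one row per patch
    Xr = sigmoid(H @ W2 + b2)             # decoded reconstruction
    recon = np.mean(np.sum((Xr - X) ** 2, axis=1))
    rho_j = H.mean(axis=0)                # average activation of each hidden unit
    sp = np.sum(rho * np.log(rho / rho_j)
                + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_j)))
    return recon + beta * sp

rng = np.random.default_rng(1)
X = rng.random((32, 169))                 # 32 flattened 13x13 patches
W1 = 0.01 * rng.standard_normal((169, 100))
W2 = 0.01 * rng.standard_normal((100, 169))
b1 = np.zeros(100)
b2 = np.zeros(169)
loss = autoencoder_loss(X, W1, b1, W2, b2)
```

Back-propagating the gradient of this loss through W1, W2, b1, b2 (here omitted) yields the sparse-autoencoder training the text describes.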
Given a sharp image patch x_S and a blurred image patch x_B as input, the extracted sharp and blur features are, respectively,
h_S = f(W1^S * x_S + b1^S)
h_B = f(W1^B * x_B + b1^B)
and the decoded reconstructions are
x_S' = f(W2^S * h_S + b2^S)
x_B' = f(W2^B * h_B + b2^B)
The parameters {W1^S, W2^S, b1^S, b2^S} define the sharp perception network (SPN) and the parameters {W1^B, W2^B, b1^B, b2^B} define the blur perception network (BPN); the two sub-networks are trained with two different data sets.
Step 3, taking the features extracted by the sharp/blur perception networks as the input of the joint perception network, and training the joint perception network with image patches of varying blur levels to obtain intrinsic blur features.
The pre-trained sharp perception network (SPN) and blur perception network (BPN) are combined to extract the intrinsic features of blur, which form a nonlinear mapping of the distinct sharp and blur features. The output of the first layer of the joint perception network (SBBN) is defined as the concatenation
h_2 = [h_S; h_B]
so the operation of the joint perception network (SBBN) is
h_3 = f(W3 * h_2 + b3)
where h_3 is the output vector of the output layer of the joint perception network, W3 represents a weight and b3 a bias. The output layer of the joint perception network contains 169 nodes.
To train the joint perception network, the parameters {W1^S, b1^S, W1^B, b1^B} are fixed, and the remaining parameters (W3, b3) are optimized by reducing the loss between h_3 and y_R. The loss function is
L = (1/n) * Σ_{i=1}^{n} ||h_3^(i) - y_R^(i)||^2
where n is the number of training image patches and y_R is the residual between the blurred input and the original sharp image patch, representing the information lost between them. The loss function is minimized by the back-propagation algorithm.
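The joint perception step can be sketched as follows, under the assumption that the first layer concatenates the sharp and blur feature vectors; the feature dimensions follow the description, while the weight scale and the stand-in residual targets are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def joint_forward(h_s, h_b, W3, b3):
    """Concatenate sharp and blur features, then map to a 169-dim output."""
    h2 = np.concatenate([h_s, h_b], axis=1)   # (n, 200)
    return sigmoid(h2 @ W3 + b3)              # h3: (n, 169)

def residual_loss(h3, y_r):
    """Mean squared error against the sharp-minus-blurred residual targets."""
    return np.mean(np.sum((h3 - y_r) ** 2, axis=1))

rng = np.random.default_rng(2)
n = 16
h_s = rng.random((n, 100))                    # features from the SPN encoder
h_b = rng.random((n, 100))                    # features from the BPN encoder
W3 = 0.01 * rng.standard_normal((200, 169))
b3 = np.zeros(169)
y_r = 0.1 * rng.random((n, 169))              # stand-in residual targets
h3 = joint_forward(h_s, h_b, W3, b3)
loss = residual_loss(h3, y_r)
```

During training, only W3 and b3 would be updated by back-propagation while the SPN/BPN encoder parameters stay frozen, matching the text above.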
Step 4, training the nonlinear feature mapping and regression network, and mapping the intrinsic features of step 3 to a blur degree.
A multi-layer neural network implements the nonlinear feature mapping and regression, in which each feature mapping layer operates as
h_i = f(W_i * h_{i-1} + b_i)
where i = 4, 5 and h_i is the output vector of the feature mapping layer. The two feature mapping layers have 100 and 50 nodes, respectively. The final regression layer is
D_B = max(0, W6 * h5 + b6)
where ReLU is used as the activation function.
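The mapping and regression layers can be sketched as below; the layer widths 100 and 50 follow the description, while the weight initialization scale is an illustrative assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def blur_regression(h3, params):
    """Two sigmoid feature-mapping layers (100, 50 units) and a ReLU regression."""
    h4 = sigmoid(h3 @ params["W4"] + params["b4"])            # 169 -> 100
    h5 = sigmoid(h4 @ params["W5"] + params["b5"])            # 100 -> 50
    d_b = np.maximum(0.0, h5 @ params["W6"] + params["b6"])   # 50 -> 1 blur degree
    return d_b

rng = np.random.default_rng(3)
params = {
    "W4": 0.01 * rng.standard_normal((169, 100)), "b4": np.zeros(100),
    "W5": 0.01 * rng.standard_normal((100, 50)),  "b5": np.zeros(50),
    "W6": 0.01 * rng.standard_normal((50, 1)),    "b6": np.zeros(1),
}
h3 = rng.random((8, 169))         # stand-in intrinsic features from step 3
d_b = blur_regression(h3, params)
```

The ReLU in the last layer guarantees a non-negative blur degree, consistent with D_B = max(0, W6*h5 + b6).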
The whole network is trained to estimate all network parameters {W_i, b_i}, i = 1, 2, …, 6. The parameters {W1, W2, W3, b1, b2, b3} are initialized to the corresponding pre-trained values, and the remaining parameters are randomly initialized.
Step 5, fine-tuning the whole network and optimizing all parameters. The back-propagation algorithm is used to compute the gradients of all network layers, since it applies to networks with any number of layers.
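Fine-tuning amounts to repeated gradient-descent updates applied to every layer at once; a minimal sketch of one such update follows (the learning rate and toy parameter values are illustrative, and the gradient computation itself is omitted):

```python
import numpy as np

def sgd_step(params, grads, lr=0.01):
    """Apply one gradient-descent update to every parameter in the network."""
    return {name: params[name] - lr * grads[name] for name in params}

# Toy example: a single layer's weight and bias with made-up gradients
params = {"W1": np.ones((2, 2)), "b1": np.zeros(2)}
grads = {"W1": np.full((2, 2), 0.5), "b1": np.ones(2)}
new_params = sgd_step(params, grads, lr=0.1)
```

In the full method, `grads` would come from back-propagating the regression loss through all six layers, refining both the pre-trained and the randomly initialized parameters.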

Claims (4)

1. A method for estimating the degree of blurring of a reference-free image based on a deep learning network is characterized by comprising the following steps:
step 1, generating training data: the method comprises the steps of down-sampling an image, intercepting a structured or textured clear image block in the image, and convolving the clear image block by using a Gaussian blur kernel to generate a blur image block;
step 2, respectively training two sparse self-encoders of a clear perception network and a fuzzy perception network by using the clear image blocks and the fuzzy image blocks; respectively extracting clear and fuzzy effective characteristics, and decoding and reconstructing the input;
step 3, taking the features extracted by the clear perception network and the fuzzy perception network as the input of the joint perception network, training the joint perception network by using the fuzzy image blocks, and acquiring fuzzy essential features;
step 4, training a nonlinear feature mapping and regression network, and mapping the essential features in the step 3 to a fuzzy degree;
and 5, fine-tuning the whole network and optimizing all parameters.
2. The method for estimating the degree of blur of the reference-free image based on the deep learning network as claimed in claim 1, wherein in the step 2, the specific process of extracting the sharp and blurred effective features and performing decoding reconstruction on the input is as follows:
establishing two fully-connected three-layer networks, each comprising an input layer, a hidden layer and an output layer; for an input image block x, the feature extracted by the hidden layer and the reconstruction produced by the output layer are respectively
h = f(W1*x + b1)
x' = f(W2*h + b2)
where f(x) = 1/(1 + exp(-x)) is a nonlinear function, W1, W2 represent weights and b1, b2 represent offsets;
adjusting the weights W1, W2 and offsets b1, b2 by a back-propagation method so as to reduce the reconstruction error; the training of the network is equivalent to the following optimization problem:
min_{W1, W2, b1, b2}  (1/p) * Σ_{i=1}^{p} ||x_i' - x_i||^2 + β * Σ_j SP(ρ||ρ_j)
SP(ρ||ρ_j) = ρ log(ρ/ρ_j) + (1-ρ) log((1-ρ)/(1-ρ_j))
where x_i and x_i' respectively represent an input image block and an output reconstruction result, p represents the number of input image blocks, SP(ρ||ρ_j) denotes the sparse penalty term, ρ denotes the target sparsity level, ρ_j denotes the average activation of the j-th hidden unit, and β controls the weight of the sparse penalty term;
given a clear image block x S And blurred image block x B As input, the extracted sharp and fuzzy features are:
Figure FDA0003921937630000012
Figure FDA0003921937630000013
the decoding reconstruction is respectively as follows:
Figure FDA0003921937630000021
Figure FDA0003921937630000022
parameter W 1 S ,
Figure FDA0003921937630000023
Representing a clearly aware network, parameter W 1 B ,/>
Figure FDA0003921937630000024
Representing a fuzzy-aware network, two sub-networks were trained with two different sets of data.
3. The method for estimating the degree of blur of the reference-free image based on the deep learning network as claimed in claim 2, wherein the specific process of the step 3 is as follows:
combining the pre-trained clear perception network and fuzzy perception network together to extract the intrinsic features of blur, which form a nonlinear mapping of the distinct clear and fuzzy features; defining the output of the first layer of the joint perception network as the concatenation
h_2 = [h_S; h_B]
so that the operation of the joint perception network is
h_3 = f(W3 * h_2 + b3)
where h_3 is the output vector of the output layer of the joint perception network, W3 represents a weight and b3 represents a bias;
to train the joint perception network, the parameters {W1^S, b1^S, W1^B, b1^B} are fixed, and the remaining parameters W3 and b3 are optimized by reducing the loss between h_3 and y_R; wherein the loss function is:
L = (1/n) * Σ_{i=1}^{n} ||h_3^(i) - y_R^(i)||^2
where n is the number of training image blocks and the real output y_R is the residual between the blurred input and the original clear image block, representing the information lost between them; the loss function is minimized by the back-propagation algorithm.
4. The method for estimating the degree of blur of the reference-free image based on the deep learning network as claimed in claim 3, wherein the specific process of the step 4 is as follows:
the multi-layer neural network implements the nonlinear feature mapping and regression, in which each feature mapping layer operates as:
h_i = f(W_i * h_{i-1} + b_i)
where i = 4, 5, h_i is the output vector of each feature mapping layer, and W_i, b_i are respectively the weights and biases of the feature mapping layers;
the final regression layer is:
D_B = max(0, W6 * h5 + b6)
where ReLU is used as the activation function;
the whole network is trained to estimate all network parameters {W_i, b_i}, i = 1, 2, …, 6; the parameters {W1, W2, W3, b1, b2, b3} are initialized to the corresponding pre-trained values, and the remaining parameters are randomly initialized.
CN201710909377.2A 2017-09-29 2017-09-29 No-reference image fuzzy degree estimation method based on deep learning network Active CN109598695B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710909377.2A CN109598695B (en) 2017-09-29 2017-09-29 No-reference image fuzzy degree estimation method based on deep learning network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710909377.2A CN109598695B (en) 2017-09-29 2017-09-29 No-reference image fuzzy degree estimation method based on deep learning network

Publications (2)

Publication Number Publication Date
CN109598695A CN109598695A (en) 2019-04-09
CN109598695B true CN109598695B (en) 2023-04-07

Family

ID=65955301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710909377.2A Active CN109598695B (en) 2017-09-29 2017-09-29 No-reference image fuzzy degree estimation method based on deep learning network

Country Status (1)

Country Link
CN (1) CN109598695B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033848B (en) * 2019-04-16 2021-06-29 厦门大学 Three-dimensional medical image z-axis interpolation method based on unsupervised learning
CN110517203B (en) * 2019-08-30 2023-06-23 山东工商学院 Defogging method based on reference image reconstruction
CN111526357B (en) * 2020-04-14 2021-06-29 艾瑞思检测技术(苏州)有限公司 Display card interface machine testing method based on PCA learning
CN113436137A (en) * 2021-03-12 2021-09-24 北京世纪好未来教育科技有限公司 Image definition recognition method, device, equipment and medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9218648B2 (en) * 2009-10-27 2015-12-22 Honeywell International Inc. Fourier domain blur estimation method and system
CN104680491B (en) * 2015-02-28 2016-03-30 西安交通大学 A kind of image nonuniform motion ambiguity removal method based on deep neural network
CN106971378A (en) * 2016-08-23 2017-07-21 上海海洋大学 A kind of removing rain based on single image method based on depth denoising self-encoding encoder

Also Published As

Publication number Publication date
CN109598695A (en) 2019-04-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant