CN110503608B

CN110503608B - Image denoising method based on multi-view convolutional neural network

Info

Publication number: CN110503608B
Application number: CN201910632475.5A
Authority: CN
Inventors: 徐勇; 孙利雷
Original assignee: Guizhou University
Current assignee: Guizhou University
Priority date: 2019-07-13
Filing date: 2019-07-13
Publication date: 2023-08-08
Anticipated expiration: 2039-07-13
Also published as: CN110503608A

Abstract

The invention discloses an image denoising method based on a multi-view convolutional neural network. The invention replaces BN with INBN to accelerate the convergence of the denoising network; INBN effectively compensates for the shortcomings of BN, and the method can process real noisy images, blind noise and Gaussian noise. The invention uses only a 20-layer network for denoising, which reduces the computational cost of the network. In addition, the invention uses a new GF function to better transform linear data into nonlinear data, and the denoising model is trained with a Smooth function as the objective. It also fuses features from multiple views to enhance network performance. The invention is of practical significance for disaster relief, aviation exploration and medical diagnosis.

Description

Image denoising method based on multi-view convolutional neural network

Technical Field

The invention relates to the technical field of image processing, in particular to an image denoising method based on a multi-view convolutional neural network.

Background

Digital imaging devices are widely used in fields such as disease diagnosis, personal identification and disaster relief. However, pictures taken with such devices are often degraded by camera shake, low light, rain and similar factors, which makes them unclear. Image denoising aims to restore an unclear image to a clear one; this is an inverse process that relies mainly on the model y = x + μ, where y is the noisy image, x is the clean image to be restored, and μ is the noise. From the Bayesian perspective, prior knowledge is critical to image denoising, and scholars have done much work in this respect. For example, sparse methods are very robust in image denoising tasks. Sparse methods have been combined with non-local self-similarity and applied to the denoising task. Dictionary learning has been used to remove noise while effectively reducing computational cost. Total variation regularization makes the image smoother, which helps restore a clean image. Furthermore, Markov random fields, weighted nuclear norm minimization and block-matching and 3-D filtering are among the dominant denoising methods. Although these methods have achieved good denoising performance, they still face the following challenges:

(1) These methods require manual parameter settings to obtain optimal performance;

(2) These methods require complex optimization algorithms at the test stage, which greatly increases the computational cost of these methods;

(3) These methods train one model for only one specific case, e.g. Gaussian noise with a noise level of 25, whereas noisy images in real life are complex, which greatly limits their range of application.

In recent years, deep learning has become increasingly popular thanks to graphics processing units (GPUs) and big data. Among deep learning techniques, the convolutional neural network (CNN) is a typical one, and it has become increasingly popular for the following reasons:

(1) The structure of a CNN is connected end to end and is very flexible; the structure can be designed according to the characteristics of the task;

(2) CNNs can rely on basic plug-in components, including rectified linear units (ReLU) and convolutional layers (Conv);

(3) CNNs rely on GPUs for parallel computation, which greatly improves computational efficiency.

A CNN has strong self-learning capability, does not need manual parameter tuning, and can process images rapidly on a GPU, so CNNs are also an effective approach to image restoration. For example, the SRCNN network uses three layers to handle the super-resolution task. Although it outperforms traditional methods on super-resolution, its performance degrades when the network depth exceeds three layers, so the method lacks flexibility. Subsequently, CNNs also made breakthroughs in image denoising. DnCNN was the first to use a CNN for image denoising, and it uses one model to handle multiple tasks: Gaussian denoising, super-resolution and restoration of compressed images. FFDNet feeds a noise map together with the noisy image into the denoising network, which handles blind noise effectively. IRCNN was the first to combine an optimization algorithm with a discriminative method, which is of some significance for processing real noise. MLWC combines spatial-domain features with a CNN to solve tasks such as super-resolution and denoising. Kaiming He proposed repeatedly using the + operation to improve image denoising performance. All of the above approaches have made some progress in image denoising, but the following challenges remain unsolved:

(1) Most of these methods use BN, which depends heavily on the batch size: BN performs well when the batch is large, but its performance degrades when the batch is small. BN is therefore not very robust;

(2) The above methods cannot use a single model to handle multiple tasks such as real noise, Gaussian noise and blind noise;

(3) Some of the above methods improve denoising performance by deepening the network, and others by repeatedly using the + operation, both of which greatly increase the computational cost of the network.

Disclosure of Invention

The technical problem to be solved by the invention is to provide an image denoising method based on a multi-view convolutional neural network that effectively compensates for the shortcomings of BN, accelerates network convergence, and can process real noisy images, blind noise and Gaussian noise. The method is of practical significance for disaster relief, aviation exploration and medical diagnosis.

The invention is realized as follows. The image denoising method based on the multi-view convolutional neural network comprises the following steps:

1) Features of the original image are extracted by an FFT algorithm and an image is reconstructed from them; the reconstructed image and the original image are each divided into blocks and used as the inputs of a denoising network;

2) After the blocked reconstructed image and the blocked original image have passed through the denoising network, the network output corresponding to the FFT is added to the network output corresponding to the original image, and the denoised clean image is output.

The denoising network consists of 19 layers: layer 1 consists of a convolution layer and a ReLU; layers 2-18 each consist of a convolution layer, INBN and the GF function; and layer 19 consists of a convolution layer. The network output corresponding to the FFT and the network output corresponding to the original image, both obtained through the denoising network, are added, and the sum is used as the input of the layer-19 convolution layer.
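The layer layout described above can be written down as a small plan. This is only a sketch: the function name `build_layer_plan` and the string labels are hypothetical, and the text does not say whether the two inputs share the weights of layers 1-18, so that question is left open here.

```python
# Hypothetical layer plan following the 19-layer description; labels are illustrative.
def build_layer_plan(num_layers=19):
    """Return a list of (layer_index, components) for the described layout."""
    plan = [(1, ["Conv", "ReLU"])]                 # layer 1: convolution + ReLU
    for idx in range(2, num_layers):               # layers 2..18
        plan.append((idx, ["Conv", "INBN", "GF"]))
    # Layer 19 is a single convolution; its input is the sum of the
    # FFT-image features and the original-image features.
    plan.append((num_layers, ["Conv"]))
    return plan


plan = build_layer_plan()
for idx, parts in plan:
    print(idx, " + ".join(parts))
```

Printing the plan makes the three kinds of layers and the position of the fusion step easy to check against FIG. 2.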

The input size of the network is 256 × 1 × 40 × 40, the output size is 256 × 1 × 40 × 40, and the convolution kernel size is 3 × 3, where 256 × 1 × 40 × 40 denotes a batch size of 256, 1 output channel, and a height and width of 40.
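For the output to keep the same 40 × 40 spatial size as the input under a 3 × 3 convolution, the usual choice is a stride of 1 with one pixel of padding. The text does not state the stride or padding, so both values below are assumptions; the snippet only checks the arithmetic.

```python
def conv_output_size(size, kernel=3, padding=1, stride=1):
    """Spatial output size of a convolution (padding and stride are assumed values)."""
    return (size + 2 * padding - kernel) // stride + 1


batch, channels, height, width = 256, 1, 40, 40
out_shape = (batch, channels, conv_output_size(height), conv_output_size(width))
print(out_shape)
```

Without padding the same kernel would shrink each side by 2 pixels per layer, which would not match the stated output size.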

The INBN layer is formed by passing half of the channels of a convolution layer through IN and the other half through BN, and combining the two results with a + operation, where IN is given by formula (1):

In formula (1), μ is the mean, σ is the standard deviation, ε is a constant, H is the height and W is the width;
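For reference, the commonly used form of instance normalization that matches the symbols named above is given below. The index names (t for the sample, i for the channel, j, k, l, m for spatial positions) are notation assumed here and are not taken from the patent.

```latex
\mu_{ti} = \frac{1}{HW}\sum_{l=1}^{W}\sum_{m=1}^{H} x_{tilm}, \qquad
\sigma_{ti}^{2} = \frac{1}{HW}\sum_{l=1}^{W}\sum_{m=1}^{H}\left(x_{tilm}-\mu_{ti}\right)^{2}, \qquad
y_{tijk} = \frac{x_{tijk}-\mu_{ti}}{\sqrt{\sigma_{ti}^{2}+\varepsilon}}
```

The statistics are computed per sample and per channel over the H × W spatial positions, so they do not depend on the batch size.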

BN is represented by formula (2):

Formula (2) computes the mean, formula (3) computes the variance, formula (4) performs the normalization, and formula (5) performs the reconstruction of the data.
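For reference, the standard batch normalization equations that match the four roles listed above (mean, variance, normalization, reconstruction) are shown below. The mini-batch size m and the learnable scale γ and shift β are notation assumed here.

```latex
\begin{align}
\mu_B &= \frac{1}{m}\sum_{i=1}^{m} x_i \tag{2}\\
\sigma_B^{2} &= \frac{1}{m}\sum_{i=1}^{m}\left(x_i-\mu_B\right)^{2} \tag{3}\\
\hat{x}_i &= \frac{x_i-\mu_B}{\sqrt{\sigma_B^{2}+\varepsilon}} \tag{4}\\
y_i &= \gamma\,\hat{x}_i+\beta \tag{5}
\end{align}
```

Because the mean and variance are taken over the mini-batch, their quality depends on the batch size, which is the weakness of BN noted in the Background.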

The GF function is GF(x) = ReLU(x) × tanh(x), where ReLU is Φ(x) = max(0, x); the role of ReLU is to convert linearly transformed data into nonlinear data.

Here, tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)).
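A minimal NumPy sketch of the GF function as defined above, GF(x) = ReLU(x) × tanh(x). The helper names `relu` and `gf` are chosen here for illustration.

```python
import numpy as np


def relu(x):
    """ReLU: max(0, x), applied element-wise."""
    return np.maximum(0.0, x)


def gf(x):
    """GF(x) = ReLU(x) * tanh(x), applied element-wise."""
    x = np.asarray(x, dtype=np.float64)
    return relu(x) * np.tanh(x)


x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(gf(x))
```

For negative inputs the ReLU factor is zero, so GF(x) = 0; for non-negative inputs GF(x) = x · tanh(x), which approaches x as x grows.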

The denoising model is trained using a Smooth function as the objective function, the Smooth function being as shown in equation (6):
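Equation (6) is not reproduced in this text. One plausible reading is that the "Smooth function" is the widely used smooth L1 loss; that is an assumption, not something the patent confirms, and the threshold `beta` below is likewise an assumed parameter.

```python
import numpy as np


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss (ASSUMED form of the 'Smooth function'), averaged over elements."""
    diff = np.abs(np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64))
    # Quadratic near zero, linear for large errors.
    loss = np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return float(loss.mean())


print(smooth_l1([0.5, 2.0], [0.0, 0.0]))
```

Under this reading, small residuals are penalized quadratically and large residuals linearly, which makes training less sensitive to outlier pixels than a pure squared error.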

Image reconstruction: after the features of the image are extracted, they are multiplied by the parameters learned in model training to obtain a feature map. The feature map differs from the original image in that it captures the key content of the original image. For example, after feature extraction on a human face image, an image can be reconstructed that represents the whole face using features such as the eyes, nose and mouth.

Image blocking: the whole image is divided into a number of small blocks, so that local features of the image can be extracted quickly.
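Image blocking can be sketched as follows. The block size of 40 follows the 40 × 40 network input stated earlier; the non-overlapping stride and the function name `split_into_blocks` are assumptions made for illustration.

```python
import numpy as np


def split_into_blocks(image, block=40, stride=40):
    """Cut a 2-D image into block x block patches; partial border patches are dropped."""
    h, w = image.shape
    blocks = [
        image[r:r + block, c:c + block]
        for r in range(0, h - block + 1, stride)
        for c in range(0, w - block + 1, stride)
    ]
    return np.stack(blocks)


image = np.arange(120 * 80, dtype=np.float32).reshape(120, 80)
blocks = split_into_blocks(image)
print(blocks.shape)  # (6, 40, 40)
```

A smaller stride would give overlapping blocks and more training samples from the same image.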

Input: the FFT feature image and the original image are input, and both are divided into blocks.
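The text does not specify how features are extracted with the FFT or how the image is reconstructed from them. Purely as an illustration, the sketch below keeps a centred low-frequency region of the 2-D spectrum and inverts it; the low-pass mask, the `keep_fraction` parameter and the function name are all assumptions, not the patented procedure.

```python
import numpy as np


def fft_reconstruct(image, keep_fraction=0.25):
    """Illustrative only: keep a centred low-frequency square of the spectrum and invert it."""
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))   # zero frequency moved to the centre
    kh = max(1, int(h * keep_fraction / 2))
    kw = max(1, int(w * keep_fraction / 2))
    ch, cw = h // 2, w // 2
    mask = np.zeros((h, w), dtype=bool)
    mask[max(0, ch - kh):ch + kh + 1, max(0, cw - kw):cw + kw + 1] = True
    filtered = np.where(mask, spectrum, 0)
    # The input is real, so any imaginary part left after inversion is discarded.
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))


rng = np.random.default_rng(0)
img = rng.random((40, 40))
recon = fft_reconstruct(img)
print(recon.shape)
```

The reconstructed image has the same size as the original, so both can be blocked in the same way before entering the network.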

The invention replaces BN with INBN to accelerate the convergence of the denoising network; INBN effectively compensates for the shortcomings of BN, and the method can process real noisy images, blind noise and Gaussian noise. The invention uses only a 20-layer network for denoising, which reduces the computational cost of the network. In addition, the invention uses a new GF function to better transform linear data into nonlinear data, and the denoising model is trained with a Smooth function as the objective. It also fuses features from multiple views to enhance network performance. The invention is of practical significance for disaster relief, aviation exploration and medical diagnosis.

Drawings

FIG. 1 is an overall flow chart of a network of the present invention;

FIG. 2 is a block diagram of a denoising network according to the present invention;

FIG. 3 shows the original noisy image and the image of spatial-domain features extracted by FFT according to an embodiment of the present invention;

FIG. 4 shows 2 blocks of the blocked original noisy image according to an embodiment of the present invention;

FIG. 5 shows 2 blocks of the blocked FFT image according to an embodiment of the present invention;

FIG. 6 shows 2 blocks of the clean image according to an embodiment of the present invention;

FIG. 7 is a comparison of the original image, the image of spatial-domain features extracted by FFT, and the restored clean image according to an embodiment of the present invention.

Detailed Description

Embodiment of the invention: in the image denoising method based on the multi-view convolutional neural network, a Smooth function is used as the objective function for training the denoising network. Gaussian noise with a noise level of 75 is taken as an example.
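A noisy training input for this example can be simulated with the noise model y = x + μ from the Background. The sketch assumes that "noise level 75" means zero-mean Gaussian noise with a standard deviation of 75 on a 0-255 intensity scale, which is a common convention but is not stated in the text; the function name and the seed are illustrative.

```python
import numpy as np


def add_gaussian_noise(clean, sigma=75.0, seed=0):
    """Return y = x + mu, with mu drawn from N(0, sigma^2) (sigma scale is an assumption)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=clean.shape)
    return clean.astype(np.float64) + noise


clean = np.full((40, 40), 128.0)
noisy = add_gaussian_noise(clean)
print(noisy.shape, float((noisy - clean).std()))
```

The result is left unclipped here; clipping to the valid intensity range is a separate design choice.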

The method comprises the following steps:

1) Features of the original image are extracted by FFT and an image is reconstructed from them; the reconstructed image and the original image are each divided into blocks and used as the inputs of the network, as shown in FIG. 3.

The network consists of 19 layers: layer 1 consists of a convolution layer and a ReLU; layers 2-18 each consist of a convolution layer, INBN and the GF function; and layer 19 consists of a convolution layer. The features obtained from the two inputs of step 1 are fused with a "+" operation, as shown in FIG. 2, and the fused features are then fed to the layer-19 convolution layer.

The INBN layer is formed by passing half of the channels of a convolution layer through IN and the other half through BN, and then combining the results with a cat+ operation, where IN is given by formula (1):

BN is represented by formula (2):
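The split-and-combine structure of the INBN layer can be sketched in NumPy as follows. Several points are assumptions: the two halves are joined by concatenation along the channel axis (one reading of "cat"), BN uses only the statistics of the current batch with no running averages, the scale and shift default to 1 and 0, and all function names are illustrative.

```python
import numpy as np


def instance_norm(x, eps=1e-5):
    """Normalize each sample and channel over its H x W positions. x: (N, C, H, W)."""
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each channel over the batch and spatial positions, then scale and shift."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


def inbn(x, eps=1e-5):
    """First half of the channels through IN, second half through BN, then concatenated."""
    half = x.shape[1] // 2
    in_part = instance_norm(x[:, :half], eps)
    bn_part = batch_norm(x[:, half:], eps=eps)
    return np.concatenate([in_part, bn_part], axis=1)


rng = np.random.default_rng(0)
features = rng.normal(3.0, 2.0, size=(8, 4, 5, 5))
out = inbn(features)
print(out.shape)
```

The IN half does not depend on the batch size, which is the property the text relies on to offset the batch-size sensitivity of BN.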

The tanh function used in the GF function is tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)).

The invention uses the Smooth function as the objective function to train the denoising model; the Smooth function makes the restored image smoother.

Claims

1. The image denoising method based on the multi-view convolutional neural network is characterized by comprising the following steps of:

2) After the blocked reconstructed image and the blocked original image have passed through a denoising network, the network output corresponding to the FFT is added to the network output corresponding to the original image, and a denoised clean image is output;

the denoising network consists of 19 layers, wherein layer 1 consists of a convolution layer and a ReLU; layers 2-18 consist of convolution layers, INBN and GF functions; and layer 19 consists of a convolution layer; the network output corresponding to the FFT and the network output corresponding to the original image, both obtained through the denoising network, are added, and the sum is used as the input of the layer-19 convolution layer;

BN is represented by formula (2):

2. The multi-view convolutional neural network-based image denoising method as claimed in claim 1, wherein: the input size of the network is 256 × 1 × 40 × 40, the output size is 256 × 1 × 40 × 40, and the convolution kernel size is 3 × 3, where 256 × 1 × 40 × 40 denotes a batch size of 256, 1 output channel, and a height and width of 40.

3. The multi-view convolutional neural network-based image denoising method as claimed in claim 1, wherein: the GF function is GF(x) = ReLU(x) × tanh(x), wherein ReLU is Φ(x) = max(0, x), and the role of ReLU is to perform a nonlinear transformation on the data and increase its nonlinear representation capability, and wherein tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)).

4. The multi-view convolutional neural network-based image denoising method as claimed in claim 1, wherein: the denoising model is trained using a Smooth function as the objective function, the Smooth function being as shown in equation (6):