CN114596228A

CN114596228A - Multispectral image denoising algorithm based on deep 3D convolutional sparse coding

Info

Publication number: CN114596228A
Application number: CN202210212512.9A
Authority: CN
Inventors: 尹海涛; 王天由
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2022-03-04
Filing date: 2022-03-04
Publication date: 2022-06-07

Abstract

The invention discloses a multispectral image denoising algorithm based on deep 3D convolutional sparse coding, which comprises the following steps: step 1, according to the inter-spectral information of the multispectral image, the convolutional sparse coding (CSC) model is extended to a 3D form and a 3D-CSC mathematical model is built; step 2, the 3D-CSC model is solved iteratively with the iterative shrinkage-thresholding algorithm (ISTA); step 3, a deep network is built: following the idea of deep unfolding, the iterative solution of the 3D-CSC model in step 2 is unfolded into a corresponding deep network; step 4, a data set is constructed and the network is trained; step 5, the performance of the network is tested. Because the invention applies 3D convolution and takes the inter-spectral characteristics of the image into account, it achieves a better denoising effect on multispectral images.

Description

Multispectral image denoising algorithm based on deep 3D convolutional sparse coding

Technical Field

The invention relates to the technical field of image processing, in particular to a multispectral image denoising algorithm based on deep 3D convolutional sparse coding.

Background

A multispectral image is an image containing multiple wavebands. It is formed by splitting an incident full-band or wide-band optical signal into several narrow-band beams with a specific imaging technique and imaging each beam on a corresponding band sensor. Compared with grayscale images and ordinary color images, a multispectral image has more bands and carries richer spectral information, and it is widely used in scenarios such as crop detection, medical tumor localization and industrial inspection. However, during multispectral imaging, interference from the equipment, external environmental noise and similar factors often occurs, so the acquired image contains noise. This affects the accuracy of subsequent work, for example errors in soil composition detection, inaccurate tumor localization and abnormal component inspection. Multispectral image denoising therefore has important practical value.

The task of image denoising is to reconstruct a clean image from a noisy image. Existing image denoising methods fall mainly into model-based methods and deep-learning-based methods. The main idea of model-based methods is to turn the denoising problem into an optimization problem by introducing different prior information; common priors include the total variation constraint and the sparsity constraint. Model-based methods are highly interpretable, but their performance depends to a great extent on manually designed parameters, and their flexibility is poor. In recent years, with the rise of deep learning, denoising models based on deep neural networks, such as DnCNN and CNLNet, have attracted wide attention. The core idea of these networks is to train a neural network end to end so that it maps a noisy image to a clean image. Compared with model-based methods, deep-learning-based denoising algorithms achieve a large performance improvement. However, existing deep denoising networks are mainly constructed by hand, and their interpretability is poor.

To improve the interpretability of deep neural networks, researchers have proposed deep neural networks based on deep unfolding, which originate from the Learned Iterative Shrinkage-Thresholding Algorithm (LISTA). LISTA unfolds the iterative soft-thresholding solution of a sparse model into a deep network in which each layer corresponds to one iteration. Based on the LISTA concept, deep unfolding has also been applied to image denoising, for example in CSCNet and GroupSC. Existing experimental results show that these interpretable networks perform similarly to classical deep denoising networks while having fewer parameters and stronger interpretability. However, these interpretable networks do not capture the inter-spectral characteristics of the image well, so the invention designs a multispectral image denoising network based on a deep 3D convolutional sparse coding model.

Disclosure of Invention

To solve the above problems, the present invention provides a multispectral image denoising algorithm based on deep 3D convolutional sparse coding, which can improve the interpretability of the deep neural network.

To achieve this purpose, the invention adopts the following technical solution:

The invention is a multispectral image denoising algorithm based on deep 3D convolutional sparse coding, which comprises the following steps:

step 1, according to the inter-spectral information of the multispectral image, the convolutional sparse coding model is extended to a 3D form, and a 3D-CSC mathematical model is built;

step 2, the 3D-CSC model is solved iteratively with the iterative shrinkage-thresholding algorithm (ISTA);

step 3, a deep network is built: following the idea of deep unfolding, the iterative solution of the 3D-CSC model in step 2 is unfolded into a corresponding deep network, 3D-CSCNet;

step 4, a data set is constructed and the network is trained;

step 5, the performance of the network is tested.

The invention is further improved in that the specific operation of step 1 is as follows:

step 1.1, the convolutional sparse coding model represents X as a linear combination of m convolutional features; extended to a 3D form according to the spatial structure of the multispectral image, the model reads:

$$X = \sum_{i=1}^{m} d_i * a_i \tag{1}$$

where X represents a clean multispectral image, $\{d_i\}_{i=1}^{m}$ denotes m filters of size h × h, $*$ denotes three-dimensional convolution, and $a_i$ denotes the convolutional feature corresponding to the i-th filter $d_i$;

step 1.2, convolutional sparse coding assumes that $\{a_i\}_{i=1}^{m}$ is sparse, i.e. contains only a small number of non-zero elements, and obtains it by solving the following optimization problem:

$$\min_{\{a_i\}} \frac{1}{2}\Big\| Y - \sum_{i=1}^{m} d_i * a_i \Big\|_2^2 + \lambda \sum_{i=1}^{m} \|a_i\|_1 \tag{2}$$

where Y is the observed (noisy) multispectral image, $\|\cdot\|_1$ represents the $l_1$ norm, and λ is a regularization parameter.

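The invention is further improved in that the specific process of the iterative solution in step 2 is as follows: the iterative update formula for formula (2) is:

$$A^{t+1} = S_{\lambda_t / L}\Big( A^{t} + \frac{1}{L} D^{T}\big( Y - D A^{t} \big) \Big) \tag{3}$$

where D is the 3D convolutional dictionary formed by the filters $\{d_i\}$, so that $DA = \sum_i d_i * a_i$; Y is the observed (noisy) image; A is the corresponding sparse code; the superscript t denotes the value at the t-th iteration (t = 0, 1, 2, ..., K-1); the superscript T denotes transposition; L is the Lipschitz constant; $\lambda_t$ is the regularization parameter of the t-th iteration; and $S_{\theta}(\cdot)$ is the soft threshold operator, defined as:

$$S_{\theta}(x) = \operatorname{sign}(x)\,\max(|x| - \theta,\ 0) \tag{4}$$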

The invention is further improved in that step 3 comprises the following specific steps: by introducing the variables $C = \frac{1}{L} D^{T}$ and $B = D$, formula (3) is converted to:

$$A^{t+1} = S_{\lambda_t / L}\big( A^{t} + C\,( Y - B A^{t} ) \big) \tag{5}$$

Formula (5) is then unfolded into a deep neural network in which the t-th layer corresponds to the t-th iteration of the formula. The soft threshold operator $S$ plays the role of the activation function in the network, and C and B are implemented as convolutions whose weights are learned and optimized during network training.

The invention is further improved in that: the step 4 comprises the following steps:

step 4.1, constructing a data set;

step 4.2, initializing parameters;

step 4.3, training the 3D-CSCNet;

step 4.4, reconstructing an output image.

The beneficial effects of the invention are as follows: by applying 3D convolution and taking the inter-spectral characteristics of multispectral images into account, the original convolutional sparse coding is extended to a 3D form and a corresponding 3D deep denoising network is built, so that the network has a degree of interpretability while achieving a good denoising effect on multispectral images.

Drawings

FIG. 1 is a network structure diagram of a 3D-CSCNet.

Detailed Description

To illustrate the technical solution of the present invention more clearly, a detailed description is given below with reference to the accompanying drawing:

The invention is a multispectral image denoising algorithm based on deep 3D convolutional sparse coding. Considering that a multispectral image has a certain spatial structure, the invention provides a convolutional sparse coding model based on a 3D convolutional dictionary and solves it with an alternating iterative optimization algorithm. The iterative solution is then converted into the network 3D-CSCNet using a deep unfolding strategy; the corresponding network structure is shown in FIG. 1.

A noisy image can generally be represented as:

$$Y = X + N \tag{1}$$

where $X \in \mathbb{R}^{M \times N \times C}$ represents a clean multispectral image, $N \in \mathbb{R}^{M \times N \times C}$ is the noise (the present invention considers only Gaussian noise), and $Y \in \mathbb{R}^{M \times N \times C}$ represents the noisy image; M × N is the spatial size of the image and C is the number of bands. Image denoising is the problem of recovering X from Y. The denoising method specifically comprises the following steps:

step 1, according to the inter-spectral information of the multispectral image, the convolutional sparse coding model is extended to a 3D form, and a 3D-CSC mathematical model is built;

step 2, the 3D-CSC model is solved iteratively with the iterative shrinkage-thresholding algorithm (ISTA);

step 3, a deep network is built: following the idea of deep unfolding, the iterative solution of the 3D-CSC model in step 2 is unfolded into a corresponding deep network;

step 4, a data set is constructed and the network is trained;

step 5, the performance of the network is tested.
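The degradation model Y = X + N can be simulated in a few lines. The sketch below is only an illustration under stated assumptions: the cube is stored as nested Python lists indexed `[band][row][col]`, the data are already normalized to [0, 1], and the helper name `add_gaussian_noise`, the toy cube size and the seed are hypothetical choices, not part of the patent.

```python
import random

def add_gaussian_noise(cube, sigma, seed=0):
    """Return Y = X + N for a cube stored as nested lists [band][row][col].

    N is i.i.d. zero-mean Gaussian noise with standard deviation `sigma`,
    expressed on the same [0, 1] scale as the normalized data.
    """
    rng = random.Random(seed)
    return [[[x + rng.gauss(0.0, sigma) for x in row] for row in band]
            for band in cube]

# Hypothetical 3-band, 16 x 16 "clean" cube with constant intensity 0.5.
clean = [[[0.5] * 16 for _ in range(16)] for _ in range(3)]
noisy = add_gaussian_noise(clean, sigma=0.1)

# Empirical statistics of the residual N = Y - X.
residual = [y - x
            for band_c, band_n in zip(clean, noisy)
            for row_c, row_n in zip(band_c, band_n)
            for x, y in zip(row_c, row_n)]
mean = sum(residual) / len(residual)
std = (sum((r - mean) ** 2 for r in residual) / len(residual)) ** 0.5
print(f"noise mean = {mean:.4f}, noise std = {std:.4f}")
```

The noisy values are deliberately not clipped back to [0, 1], so the residual stays exactly Gaussian; whether the original implementation clips is not stated in the source.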

In step 1, the convolutional sparse coding model represents X as a linear combination of m convolutional features; extended to a 3D form according to the spatial structure of the multispectral image, the model reads:

$$X = \sum_{i=1}^{m} d_i * a_i \tag{2}$$

where $\{d_i\}_{i=1}^{m}$ denotes m filters of size h × h, $*$ denotes three-dimensional convolution, and $a_i$ denotes the convolutional feature corresponding to the i-th filter $d_i$. Convolutional sparse coding assumes that $\{a_i\}_{i=1}^{m}$ is sparse, i.e. contains only a small number of non-zero elements, and it can be obtained by solving the following optimization problem:

$$\min_{\{a_i\}} \frac{1}{2}\Big\| Y - \sum_{i=1}^{m} d_i * a_i \Big\|_2^2 + \lambda \sum_{i=1}^{m} \|a_i\|_1 \tag{3}$$

where $\|\cdot\|_1$ represents the $l_1$ norm and λ is the regularization parameter.
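To make the synthesis model of step 1 concrete, the following sketch builds an image as a sum of 3D convolutions between filters and sparse feature maps. It is a toy illustration only: tensors are nested Python lists, the convolution is a "full" convolution (the output is larger than the feature map), the filters are cubic 2 × 2 × 2 arrays, and all names and values are hypothetical rather than taken from the patent.

```python
def conv3d_full(a, d):
    """Full 3D convolution of a feature map `a` with a filter `d` (nested lists)."""
    P, Q, R = len(a), len(a[0]), len(a[0][0])
    h1, h2, h3 = len(d), len(d[0]), len(d[0][0])
    out = [[[0.0] * (R + h3 - 1) for _ in range(Q + h2 - 1)]
           for _ in range(P + h1 - 1)]
    for p in range(P):
        for q in range(Q):
            for r in range(R):
                coeff = a[p][q][r]
                if coeff == 0.0:
                    continue  # sparse maps: most coefficients are skipped
                for i in range(h1):
                    for j in range(h2):
                        for k in range(h3):
                            out[p + i][q + j][r + k] += coeff * d[i][j][k]
    return out

def synthesize(features, filters):
    """X = sum_i d_i * a_i, summed element-wise over the m filters."""
    parts = [conv3d_full(a, d) for a, d in zip(features, filters)]
    return [[[sum(vals) for vals in zip(*rows)] for rows in zip(*planes)]
            for planes in zip(*parts)]

def zeros(n):
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]

# Two hypothetical 2 x 2 x 2 filters.
d1 = [[[1.0 + 4 * i + 2 * j + k for k in range(2)] for j in range(2)]
      for i in range(2)]                       # entries 1..8
d2 = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]  # all ones

# Sparse feature maps: one non-zero coefficient each.
a1 = zeros(2); a1[1][0][1] = 2.0
a2 = zeros(2); a2[0][0][0] = 1.0

X = synthesize([a1, a2], [d1, d2])
print(len(X), len(X[0]), len(X[0][0]))  # spatial extent of the synthesized cube
```

Each non-zero coefficient places a scaled copy of its filter into the output, which is why a few non-zero entries in the feature maps can already describe local structure across both the spatial and the spectral axes.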

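In step 2, the ISTA algorithm is applied to iteratively solve the optimization problem of step 1, yielding the sparse representation of each layer.

The solving process is as follows:

The ISTA iterative update formula for formula (3) is:

$$A^{t+1} = S_{\lambda_t / L}\Big( A^{t} + \frac{1}{L} D^{T}\big( Y - D A^{t} \big) \Big) \tag{4}$$

where D is the 3D convolutional dictionary formed by the filters $\{d_i\}$, so that $DA = \sum_i d_i * a_i$; A is the corresponding sparse code; the superscript t denotes the value at the t-th iteration (t = 0, 1, 2, ..., K-1); the superscript T denotes transposition; L is the Lipschitz constant; $\lambda_t$ is the regularization parameter of the t-th iteration; and $S_{\theta}(\cdot)$ is the soft threshold operator, defined as:

$$S_{\theta}(x) = \operatorname{sign}(x)\,\max(|x| - \theta,\ 0) \tag{5}$$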
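The ISTA iteration of step 2 can be sketched on a toy 3D problem as follows. This is a minimal illustration under stated assumptions, not the patented implementation: tensors are dictionaries keyed by index triples, the dictionary D is applied as a full 3D convolution and its transpose as the matching correlation, a single fixed λ is used for every iteration, and L is set to the bound Σ‖d_i‖₁², which upper-bounds the Lipschitz constant. All names, sizes and values are hypothetical.

```python
import random
from itertools import product

H = 2                                   # filter side length (toy value)
A_SHAPE = (3, 3, 3)                     # shape of every feature map a_i
Y_SHAPE = tuple(s + H - 1 for s in A_SHAPE)
OFFSETS = list(product(range(H), repeat=3))
A_INDEX = list(product(*map(range, A_SHAPE)))
Y_INDEX = list(product(*map(range, Y_SHAPE)))

def shift(p, o):
    return (p[0] + o[0], p[1] + o[1], p[2] + o[2])

def apply_D(codes, filters):
    """D A = sum_i d_i * a_i (full 3D convolution)."""
    out = dict.fromkeys(Y_INDEX, 0.0)
    for a, d in zip(codes, filters):
        for p in A_INDEX:
            if a[p] != 0.0:
                for o in OFFSETS:
                    out[shift(p, o)] += a[p] * d[o]
    return out

def apply_DT(residual, filters):
    """D^T r: correlate the residual with every filter (adjoint of apply_D)."""
    return [{p: sum(residual[shift(p, o)] * d[o] for o in OFFSETS)
             for p in A_INDEX} for d in filters]

def soft(x, theta):
    """Soft threshold S_theta(x) = sign(x) * max(|x| - theta, 0)."""
    if x > theta:
        return x - theta
    if x < -theta:
        return x + theta
    return 0.0

def objective(y, codes, filters, lam):
    rec = apply_D(codes, filters)
    fit = 0.5 * sum((y[q] - rec[q]) ** 2 for q in Y_INDEX)
    return fit + lam * sum(abs(v) for a in codes for v in a.values())

def ista(y, filters, lam, K):
    # Assumption: L = sum_i ||d_i||_1^2, an upper bound on ||D||^2.
    L = sum(sum(abs(v) for v in d.values()) ** 2 for d in filters)
    codes = [dict.fromkeys(A_INDEX, 0.0) for _ in filters]   # A^0 = 0
    history = [objective(y, codes, filters, lam)]
    for _ in range(K):
        rec = apply_D(codes, filters)
        residual = {q: y[q] - rec[q] for q in Y_INDEX}
        grad = apply_DT(residual, filters)
        codes = [{p: soft(a[p] + g[p] / L, lam / L) for p in A_INDEX}
                 for a, g in zip(codes, grad)]
        history.append(objective(y, codes, filters, lam))
    return codes, history

# Toy data: two random filters and a code with two non-zero entries.
rng = random.Random(0)
filters = [{o: rng.uniform(-1.0, 1.0) for o in OFFSETS} for _ in range(2)]
true_codes = [dict.fromkeys(A_INDEX, 0.0) for _ in filters]
true_codes[0][(1, 1, 1)] = 1.0
true_codes[1][(0, 2, 1)] = -1.0
y = apply_D(true_codes, filters)

codes, history = ista(y, filters, lam=0.05, K=30)
print(f"objective: {history[0]:.4f} -> {history[-1]:.4f}")
```

With a step size of 1/L and L no smaller than the Lipschitz constant, the objective does not increase from one iteration to the next, which is the property the `history` list lets one inspect.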

In step 3, a deep unfolding strategy is adopted to unfold the 3D-CSC model into 3D-CSCNet. The specific method is to unfold formula (4) into a deep neural network following the idea of deep unfolding. First, by introducing the variables $C = \frac{1}{L} D^{T}$ and $B = D$, formula (4) is converted to:

$$A^{t+1} = S_{\lambda_t / L}\big( A^{t} + C\,( Y - B A^{t} ) \big) \tag{6}$$

Then formula (6) is unfolded into a deep neural network in which the t-th layer corresponds to the t-th iteration of the formula, that is, each layer of the network carries out one iteration of the solution. The soft threshold operator plays the role of the activation function in the network, and C and B are implemented as convolutions whose weights are learned during network training.
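A possible PyTorch rendering of the unfolded iteration is sketched below. It is an assumption-laden illustration rather than the network of FIG. 1: the class name `CSC3DNetSketch`, the treatment of the multispectral cube as a single-channel volume of shape (batch, 1, bands, height, width), the per-layer, per-feature learnable thresholds, the zero initialization A^0 = 0 and the toy hyperparameters are all hypothetical choices.

```python
import torch
import torch.nn as nn

class CSC3DNetSketch(nn.Module):
    """Unfolded iteration A <- S(A + C(Y - B A)), followed by X_hat = D A."""

    def __init__(self, m=4, h=3, K=3):
        super().__init__()
        pad = h // 2  # odd h keeps the cube size unchanged
        # C maps the image to m feature maps; B and D map m feature maps back.
        self.C = nn.Conv3d(1, m, h, padding=pad, bias=False)
        self.B = nn.Conv3d(m, 1, h, padding=pad, bias=False)
        self.D = nn.Conv3d(m, 1, h, padding=pad, bias=False)
        # One learnable threshold per layer and per feature map (assumption).
        self.theta = nn.Parameter(torch.full((K, 1, m, 1, 1, 1), 1e-2))
        self.K = K

    @staticmethod
    def soft(x, theta):
        # Soft threshold: sign(x) * max(|x| - theta, 0)
        return torch.sign(x) * torch.relu(x.abs() - theta)

    def forward(self, y):
        # First layer with A^0 = 0: A^1 = S(C Y)
        a = self.soft(self.C(y), self.theta[0])
        for t in range(1, self.K):
            a = self.soft(a + self.C(y - self.B(a)), self.theta[t])
        return self.D(a)  # reconstruction from the final sparse code

torch.manual_seed(0)
net = CSC3DNetSketch(m=4, h=3, K=3)
y = torch.rand(2, 1, 31, 16, 16)   # two toy cubes with 31 bands
x_hat = net(y)
print(x_hat.shape)
```

Sharing C and B across layers mirrors the fact that every iteration of the formula uses the same operators; untying them per layer would be an alternative design that the source does not describe.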

The step 4 specifically comprises the following steps: constructing a data set, initializing parameters, training the 3D-CSCNet, and reconstructing an output image.

Constructing the data set: the network is trained on the CAVE data set, which contains 32 scenes; each scene consists of 31 grayscale images of size 512 × 512, one per band. The 31 grayscale images of a scene are therefore stacked in band order into a three-dimensional tensor of size 512 × 512 × 31. Before the data are read, each tensor is cropped into patches of size 32 × 32 with a stride of 31 pixels, and the resulting tensors are stored in a .mat file for later loading. Finally, the cropped data are normalized so that the gray-level range is [0, 1] and divided into a training set and a test set at a ratio of 6:4.
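The patch counts implied by this cropping scheme can be worked out directly. The sketch below assumes that patches are cut on the spatial grid only (all 31 bands are kept in each patch), that only patches lying fully inside the image are used, and that the 6:4 split is applied to individual patches; none of these details is stated in the source, and the helper name `positions` is hypothetical.

```python
def positions(length, patch, stride):
    """Start indices of patches that fit entirely inside `length` pixels."""
    return list(range(0, length - patch + 1, stride))

starts = positions(512, patch=32, stride=31)
per_scene = len(starts) ** 2      # patches on the 2D spatial grid of one scene
total = 32 * per_scene            # 32 CAVE scenes
n_train = round(0.6 * total)      # 6:4 split
n_test = total - n_train

print(len(starts), per_scene, total, n_train, n_test)
# prints: 16 256 8192 4915 3277
```

With a stride of 31 and a patch size of 32, neighbouring patches overlap by one pixel, and the last patch starts at pixel 465, leaving the final 15 pixels of each axis unused under these assumptions.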

Initializing parameters: input the multispectral image Y, randomly initialize the 3D dictionary D, and initialize the regularization parameter λ;

Training 3D-CSCNet: the invention simulates noisy images by manually adding Gaussian noise. The network parameters of the whole 3D-CSCNet are Θ = {C, B, D, λ}, and these parameters are obtained by training the network with end-to-end supervised learning. The loss function employed in the present invention is the mean square error (MSE) loss, defined as

$$\mathcal{L}(\Theta) = \frac{1}{M} \sum_{j=1}^{M} \big\| f_{\Theta}(Y_j) - X_j \big\|_2^2$$

where M is the number of training samples, $f_{\Theta}$ is the 3D-CSCNet with parameters Θ, $Y_j$ represents the j-th training sample, $X_j$ is the corresponding clean image, and $f_{\Theta}(Y_j)$ is the estimate produced by 3D-CSCNet. 3D-CSCNet is implemented in PyTorch and is trained and tested on an RTX 2070 Super GPU. The batch size is set to 8, the number of training epochs to 200, and the convolution kernel size to 7 × 7, and the Adam optimizer is used. When training 3D-CSCNet, the number of unfolded layers K is set, formula (6) is applied to compute the sparse code of layer t (t = 0, 1, 2, ..., K), and the procedure ends once the set number of iterations is reached.

Reconstructing the output image: from the final sparse code $A^{K}$, the denoised image is reconstructed according to

$$\hat{X} = D A^{K} = \sum_{i=1}^{m} d_i * a_i^{K}$$

Step 5, the specific method for testing the model is as follows: the network performance is evaluated at three different noise levels, [0.1, 0.2, 0.3], and the evaluation index is the mean peak signal-to-noise ratio (MPSNR). The compared methods are LRMR, LRTV, LLRGTV and CSCNet, and the results are shown in Table 1:

TABLE 1 MPSNR results of different methods on CAVE test set

Algorithm	σ＝25	σ＝50	σ＝75
LRMR	33.48	28.44	25.00
LRTV	36.21	32.03	29.41
LLRGTV	34.40	29.71	26.82
CSCNet	35.36	32.50	30.21
3D-CSCNet	37.40	33.53	25.69

As can be seen from Table 1, the 3D-CSCNet of the invention achieves the highest MPSNR among the compared methods at σ = 25 and σ = 50. Notably, compared with CSCNet, 3D-CSCNet performs better on the CAVE data set at these two common noise levels, which also demonstrates the effectiveness of the model.
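For reference, the MPSNR metric reported in Table 1 can be computed as sketched below. The sketch assumes the common definition, in which the PSNR is computed separately for each band and then averaged over bands, with a peak value of 1.0 for data normalized to [0, 1]; the exact evaluation code behind Table 1 is not given in the source, and the function names are hypothetical.

```python
import math

def psnr(clean_band, est_band, peak=1.0):
    """PSNR in dB of one band, given as nested lists [row][col]."""
    sq = [(c - e) ** 2
          for row_c, row_e in zip(clean_band, est_band)
          for c, e in zip(row_c, row_e)]
    mse = sum(sq) / len(sq)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def mpsnr(clean, estimate, peak=1.0):
    """Mean PSNR over bands for cubes stored as [band][row][col]."""
    values = [psnr(c, e, peak) for c, e in zip(clean, estimate)]
    return sum(values) / len(values)

# Toy check: two 4 x 4 bands with constant errors of 0.1 and 0.01.
clean = [[[0.5] * 4 for _ in range(4)] for _ in range(2)]
estimate = [[[0.6] * 4 for _ in range(4)],
            [[0.51] * 4 for _ in range(4)]]
print(f"{mpsnr(clean, estimate):.2f} dB")
```

In the toy check the two bands have mean squared errors of 0.01 and 0.0001, i.e. 20 dB and 40 dB, so the mean is 30 dB up to floating-point rounding.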

Claims

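1. A multispectral image denoising algorithm based on deep 3D convolutional sparse coding, characterized in that the method comprises the following steps:

step 1, according to the inter-spectral information of the multispectral image, the convolutional sparse coding model is extended to a 3D form, and a 3D-CSC mathematical model is built;

step 2, the 3D-CSC model is solved iteratively with the iterative shrinkage-thresholding algorithm (ISTA);

step 3, a deep network is built: following the idea of deep unfolding, the iterative solution of the 3D-CSC model in step 2 is unfolded into a corresponding deep network, 3D-CSCNet;

step 4, a data set is constructed and the network is trained;

step 5, the performance of the network is tested.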

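2. The multispectral image denoising algorithm based on deep 3D convolutional sparse coding of claim 1, wherein the specific operation of step 1 is as follows:

step 1.1, the convolutional sparse coding model represents an input signal X as a linear combination of m convolutional features; extended to a 3D form according to the spatial structure of the multispectral image, the model reads:

$$X = \sum_{i=1}^{m} d_i * a_i \tag{1}$$

where X represents a clean multispectral image, $\{d_i\}_{i=1}^{m}$ denotes m filters of size h × h, $*$ denotes three-dimensional convolution, and $a_i$ denotes the convolutional feature corresponding to the i-th filter $d_i$;

step 1.2, convolutional sparse coding assumes that $\{a_i\}_{i=1}^{m}$ is sparse, i.e. contains only a small number of non-zero elements, and obtains it by solving the following optimization problem:

$$\min_{\{a_i\}} \frac{1}{2}\Big\| Y - \sum_{i=1}^{m} d_i * a_i \Big\|_2^2 + \lambda \sum_{i=1}^{m} \|a_i\|_1 \tag{2}$$

where Y is the observed (noisy) multispectral image, $\|\cdot\|_1$ represents the $l_1$ norm, and λ is a regularization parameter.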

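3. The multispectral image denoising algorithm based on deep 3D convolutional sparse coding of claim 2, wherein the specific process of the iterative solution in step 2 is as follows: the iterative update formula for formula (2) is:

$$A^{t+1} = S_{\lambda_t / L}\Big( A^{t} + \frac{1}{L} D^{T}\big( Y - D A^{t} \big) \Big) \tag{3}$$

where D is the 3D convolutional dictionary formed by the filters $\{d_i\}$, A is the corresponding sparse code, the superscript t denotes the value at the t-th iteration (t = 0, 1, 2, ..., K-1), the superscript T denotes transposition, L is the Lipschitz constant, $\lambda_t$ is the regularization parameter of the t-th iteration, and $S_{\theta}(\cdot)$ is the soft threshold operator, defined as:

$$S_{\theta}(x) = \operatorname{sign}(x)\,\max(|x| - \theta,\ 0) \tag{4}$$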

4. The multispectral image denoising algorithm based on deep 3D convolutional sparse coding of claim 3, wherein step 3 comprises the following specific steps: by introducing the variables $C = \frac{1}{L} D^{T}$ and $B = D$, formula (3) is converted to:

$$A^{t+1} = S_{\lambda_t / L}\big( A^{t} + C\,( Y - B A^{t} ) \big) \tag{5}$$

formula (5) is unfolded into a deep neural network, wherein the t-th layer of the neural network corresponds to the t-th iteration of the formula.

5. The multispectral image denoising algorithm based on deep 3D convolutional sparse coding of claim 1, wherein step 4 comprises the following steps:

step 4.1, constructing a data set: using the CAVE data set, the images of each scene are synthesized into a three-dimensional tensor, the three-dimensional tensor is cropped as required and normalized, and the data are divided into a training set and a test set at a ratio of 6:4;

step 4.2, initializing parameters;

step 4.3, training the 3D-CSCNet;

step 4.4, reconstructing an output image.