CN116051396B - Image denoising method based on feature enhancement network and GRU network - Google Patents


Info

Publication number: CN116051396B (granted publication of application CN202211382433.9A)
Authority: CN (China)
Prior art keywords: image, model, training, network, denoising
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN116051396A
Inventors: Guo Lili (郭丽丽), Wang Qidong (王琦栋), Ding Shifei (丁世飞)
Assignee (original and current): China University of Mining and Technology (CUMT)
Application filed by China University of Mining and Technology; priority to CN202211382433.9A; published as CN116051396A, then granted and published as CN116051396B.

Classifications

    • G06T 5/00 Image enhancement or restoration; G06T 5/70 Denoising; Smoothing
    • G06N 3/02 Neural networks; G06N 3/08 Learning methods
    • G06T 5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • Y02A 90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The invention provides an image denoising method based on a feature enhancement network and a GRU network. Noise-free images and images contaminated by noise of different levels are collected to form separate data sets, and the collected data sets are preprocessed. A feature enhancement network, a backbone network, two different layers of GRU networks, and a reconstruction network are built and connected in sequence to form an initial denoising model. The maximum number of training rounds, a convergence threshold, and the other hyper-parameters required for training are set; the training and test data sets are prepared, and end-to-end training of the initial denoising model begins. During training the initial denoising model takes a noisy image as input and outputs a denoised image; once the loss function has converged and the maximum number of training rounds has been reached, model training is complete. The trained model is saved as the image denoising model and used to denoise images. The method significantly improves the denoising performance of the model.

Description

Image denoising method based on feature enhancement network and GRU network
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to an image denoising method based on a feature enhancement network and a GRU network.
Background
An image is a very important information carrier containing rich data, but it is vulnerable to noise contamination and loss of detail. The presence of noise degrades image quality. In practical application scenes, real noise tends to be more complex and generally more harmful to image quality; even a small amount of noise may have a serious effect. Removing noise to obtain high-quality images is therefore necessary, and denoising is an important task in the field of computer vision.
Traditional digital image denoising techniques can be broadly divided into two categories: filter-based methods and mathematical-model-based methods. Filter-based methods are common in digital image processing and mainly include techniques such as mean filtering, median filtering, and Wiener filtering. Mathematical-model-based methods model the distributions of natural images and of noise, and mainly include non-local similarity models, sparse models, gradient models, and the like. Although traditional methods can achieve a certain denoising effect, they usually suffer from three shortcomings: denoising performance drops markedly at high noise levels and under severe contamination; parameters must be tuned manually and the iteration process is complex; and image texture is insufficiently protected during denoising, so the image structure is easily damaged.
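As a concrete illustration of the filter-based category, a median filter can be sketched as follows. This is a plain NumPy sketch for illustration only; a real pipeline would use an optimized routine such as `scipy.ndimage.median_filter`.

```python
import numpy as np

def median_filter(img, k=3):
    """Classic median filtering: replace each pixel by the median of its
    k x k neighbourhood (edges handled by reflection padding)."""
    p = k // 2
    padded = np.pad(img, p, mode="reflect")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out
```

Because the median is insensitive to outliers, a single impulse-noise pixel in an otherwise flat region is removed completely, which is exactly the behaviour that makes median filtering popular against salt-and-pepper noise.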
Image denoising techniques based on convolutional neural networks mainly extract features from the noisy image and then either reconstruct the restored image from the obtained feature information, or directly predict the noise contained in the noisy image and subtract it pixel by pixel from the noisy image to obtain a clean image. With sufficient data, convolutional-neural-network-based denoising achieves a good denoising effect and can meet batch-processing requirements.
However, denoising methods based on convolutional neural networks still have the following problems: conventional neural network methods neglect the extraction of shallow feature information from the image, so shallow information is under-used and the denoising performance of the model is limited; moreover, feature information cannot be fused well and is easily lost.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an image denoising method based on a feature enhancement network and a GRU network that obtains richer shallow image features, makes full use of shallow information, fuses feature information well, effectively reduces the loss of feature information, and markedly improves the denoising performance of the model.
In order to achieve the above object, the present invention provides an image denoising method based on a feature enhancement network and a GRU network, comprising the steps of:
step one, data acquisition;
collecting noise-free images and images contaminated by noise of different levels to form separate data sets, thereby constructing the data sets required for model training;
step two, data preprocessing;
preprocessing the acquired data set to expand and enhance it;
step three, model construction;
building a feature enhancement network, a backbone network, two different layers of GRU networks, and a reconstruction network, and connecting these networks in sequence to form an initial denoising model;
the feature enhancement network, named EBlock, comprises 4 dilated (hole) convolution layers with dilation factor 2, 1 convolution layer each with kernel size 1, 3, 5, and 7, 1 channel fusion layer, 1 coding layer, and 1 decoding layer; each dilated convolution layer is combined with batch normalization and a ReLU activation function, and each of the 4 convolution layers with different kernels is combined with batch normalization and a PReLU activation function; the 4 dilated convolution layers and the 4 convolution layers with different kernels form 4 branch networks, and the 4 branch networks, the channel fusion layer, the coding layer, and the decoding layer are connected in sequence;
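The EBlock structure described above can be sketched roughly in PyTorch. This is an illustrative reconstruction, not the patented implementation: the per-branch width of 16 channels follows the fused 64-channel figure given later in the description, while the padding choices and the exact ordering of layers within each branch are assumptions.

```python
import torch
import torch.nn as nn

class Branch(nn.Module):
    """One EBlock branch: dilated conv (dilation 2) + BN + ReLU,
    then a conv with the branch-specific kernel size + BN + PReLU."""
    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=2, dilation=2),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, k, padding=k // 2),
            nn.BatchNorm2d(out_ch), nn.PReLU(),
        )
    def forward(self, x):
        return self.body(x)

class EBlock(nn.Module):
    """Four parallel branches (kernel sizes 1, 3, 5, 7), channel fusion,
    then a coding layer (64 -> 128) and a decoding layer (128 -> 64)."""
    def __init__(self, in_ch=1):
        super().__init__()
        self.branches = nn.ModuleList(Branch(in_ch, 16, k) for k in (1, 3, 5, 7))
        self.encode = nn.Conv2d(64, 128, 3, padding=1)   # coding layer
        self.decode = nn.Conv2d(128, 64, 3, padding=1)   # decoding layer
    def forward(self, x):
        # channel fusion: concatenate 4 x 16-channel maps into one 64-channel map
        fused = torch.cat([b(x) for b in self.branches], dim=1)
        return self.decode(self.encode(fused))
```

Feeding a single-channel noisy image through `EBlock` yields a 64-channel shallow-feature map of the same spatial size, which is what the backbone network then consumes.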
the backbone network comprises 1 group formed by a convolution layer and a ReLU activation layer, followed by 16 groups each formed by a convolution layer, a batch normalization layer, and a PReLU activation layer, all connected in sequence; in addition, a global residual connection is placed between the input and the output of the backbone network;
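A rough PyTorch sketch of the backbone follows. The channel width of 64 matches the EBlock output described elsewhere in the document; the 3x3 kernel size is an assumption.

```python
import torch
import torch.nn as nn

class Backbone(nn.Module):
    """1 x (conv + ReLU) followed by 16 x (conv + BN + PReLU),
    with a global residual connection from input to output."""
    def __init__(self, channels=64, depth=16):
        super().__init__()
        layers = [nn.Conv2d(channels, channels, 3, padding=1),
                  nn.ReLU(inplace=True)]
        for _ in range(depth):
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.BatchNorm2d(channels), nn.PReLU()]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        # global residual connection between backbone input and output
        return x + self.body(x)
```

The residual connection lets the deep stack learn only a correction to the incoming features, which is why the document credits it with reducing feature-information loss on the way to the GRU networks.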
the two layers of GRU networks are built on convolutional GRUs and are named ConvGRU1 and ConvGRU2; ConvGRU1 and ConvGRU2 are connected in sequence. Each GRU network combines ReLU, Sigmoid, and Tanh activation functions with concatenation, pixel-wise addition, and pixel-wise multiplication operations; it treats the previous input as the previous hidden state and fuses it with the current input through an update gate and a reset gate to achieve finer feature selection. The convolution kernel size of ConvGRU1 is 3 and that of ConvGRU2 is 5;
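A convolutional GRU cell of the kind described above might look roughly like this. The gate wiring follows the standard GRU formulation with convolutions replacing matrix products; where the patent does not spell out the exact wiring, this is an assumption.

```python
import torch
import torch.nn as nn

class ConvGRU(nn.Module):
    """Convolutional GRU cell: the previous input acts as the hidden state,
    and update/reset gates fuse it with the current input."""
    def __init__(self, ch=64, k=3):
        super().__init__()
        p = k // 2
        self.update = nn.Conv2d(2 * ch, ch, k, padding=p)
        self.reset = nn.Conv2d(2 * ch, ch, k, padding=p)
        self.cand = nn.Conv2d(2 * ch, ch, k, padding=p)

    def forward(self, h_prev, x):
        hx = torch.cat([h_prev, x], dim=1)                  # concatenation
        z = torch.sigmoid(self.update(hx))                  # update gate
        r = torch.sigmoid(self.reset(hx))                   # reset gate
        n = torch.tanh(self.cand(torch.cat([r * h_prev, x], dim=1)))
        return (1 - z) * h_prev + z * n                     # pixel-wise gated fusion
```

Instantiating `ConvGRU(64, 3)` and `ConvGRU(64, 5)` gives the two differently sized cells the document calls ConvGRU1 and ConvGRU2.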
the reconstruction network comprises 1 convolution layer and 1 subtraction operation; the convolution layer converts the 64-channel feature map output by ConvGRU2 into the same number of channels as the original noisy image, and its output is the noise information learned by the model; the subtraction operation takes the original noisy image as a prior input and subtracts the noise information from it to obtain the clean image;
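The residual-subtraction reconstruction described above can be sketched as follows (the 3x3 kernel is an assumption; the 64-to-image-channel mapping and the subtraction are as stated in the text):

```python
import torch
import torch.nn as nn

class Reconstruct(nn.Module):
    """Map 64-channel features to the image's channel count (the predicted
    noise), then subtract that noise from the noisy input."""
    def __init__(self, img_ch=1):
        super().__init__()
        self.to_noise = nn.Conv2d(64, img_ch, 3, padding=1)

    def forward(self, feats, noisy):
        noise = self.to_noise(feats)   # model-learned noise information
        return noisy - noise           # noisy image minus noise -> clean image
```

Predicting the noise rather than the clean image directly is the classic residual-learning strategy for denoising: the residual is usually easier to learn than the full image content.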
step four, model training;
s41: setting the maximum training round number, a convergence threshold value and other super parameters required by training;
s42: adjusting and preparing a training data set and a test data set, and starting end-to-end training of the initial denoising model;
s43: the initial denoising model inputs a noisy image during training and outputs a denoised image; each round of training performs the following operations: judging whether the learning rate needs to be adjusted under the current round, transmitting the prepared noise image data into a network model, predicting the model to generate an image, calculating errors of the image generated by the model prediction and a clean image, calculating a loss function, reversely transmitting and updating parameters of each layer, calculating a difference value of the current round and the loss function of the previous round, comparing the difference value with a convergence threshold value to judge whether convergence or not, firstly storing the model in the training process when convergence, then carrying out the next round of training, and not storing the model in the training process when no convergence exists, and directly carrying out the next round of training;
s44: after the loss function converges and reaches the maximum training round, the model training is completed;
step five, model preservation;
saving the trained model to form an image denoising model;
step six, denoising the image;
and (3) carrying out image denoising by using an image denoising model, wherein the denoising process is a forward propagation process, inputting an image to be denoised, and outputting a predicted clean image after model processing.
Further, in order to effectively improve the model's ability to handle specific noise contamination, in step one, when collecting images contaminated by noise of different levels, a separate training data set is constructed for each noise level.
Further, in order to effectively expand and enhance the data set so that a better training effect can be obtained later, in step two, the preprocessing comprises enhancement and normalization of the data set. The enhancement comprises image scaling and image flipping: during scaling, the images are scaled by factors of 1.0, 0.9, 0.8, and 0.7; during flipping, the images undergo a flip transformation. The normalization comprises image format normalization: the dimensions of the images are checked for consistency, and inconsistent images are resized so that all images have the same size.
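The scaling and flip augmentation can be sketched as follows. This NumPy illustration uses nearest-neighbour index sampling as a stand-in for a proper image resizer, and the helper name is hypothetical.

```python
import numpy as np

def augment(img):
    """Augmentation sketch: scale by 1.0, 0.9, 0.8, 0.7 and add a flipped
    copy of each scaled image (nearest-neighbour resize for illustration)."""
    out = []
    h, w = img.shape[:2]
    for s in (1.0, 0.9, 0.8, 0.7):
        nh, nw = int(h * s), int(w * s)
        ys = np.arange(nh) * h // nh      # row indices to sample
        xs = np.arange(nw) * w // nw      # column indices to sample
        scaled = img[ys][:, xs]
        out.append(scaled)
        out.append(np.fliplr(scaled))     # flip transform
    return out
```

Each input image thus yields eight training samples (four scales, each with a flipped copy), which is the kind of expansion the preprocessing step aims at.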
Preferably, in step S41 of step four, the maximum number of training rounds is greater than 30.
Further, in order to ensure that the trained model has good denoising performance, in step four, the initial learning rate is set to 1e-3 during training, and after 30 iterations the learning rate is decayed to 10% of its original value.
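The stated schedule (initial learning rate 1e-3, decayed to 10% after 30 iterations) maps naturally onto a step scheduler; the optimizer choice (Adam) is an assumption, since the document does not name one.

```python
import torch

model = torch.nn.Conv2d(1, 1, 3, padding=1)
# initial learning rate 1e-3; multiply by 0.1 once 30 epochs have elapsed
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(30):
    optimizer.step()       # a real training step would go here
    scheduler.step()

# after 30 epochs the learning rate has dropped from 1e-3 to 1e-4
print(optimizer.param_groups[0]["lr"])
```

Calling `optimizer.step()` before `scheduler.step()` in each epoch matches the order PyTorch expects for learning-rate schedulers.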
In the data acquisition stage of the invention, separate training data sets are prepared for different noise levels, so that data at each noise level is handled well and the model's ability to deal with specific noise contamination during subsequent training is effectively improved. Preprocessing the acquired data set to expand and enhance it improves data quality, helps avoid over-fitting, and fundamentally improves overall performance.
In the model construction stage, a feature enhancement network is introduced into the denoising model. Its 4 dilated convolution layers and 4 convolution layers with different kernel sizes form 4 branch networks, so an input noisy image enters all 4 branches simultaneously. Because dilated convolution enlarges the receptive field, more feature information can be captured; and because convolution kernels of different scales focus on and extract different information, the shallow feature information extracted by the 4 branches differs and covers different fields of view, producing a complementary effect, so richer shallow image features are obtained after combination. The channel fusion layer is connected to the outputs of all 4 branches and fuses the 16-channel feature map output by each branch into a combined 64-channel feature map; this fuses the feature information well and, together with the 4 branches, effectively solves the problem of insufficient acquisition of shallow feature information. The output of the channel fusion layer then passes through 1 coding layer and 1 decoding layer, converting the channel count from 64 to 128 and back from 128 to 64, so the encoder-decoder combination further strengthens the shallow features at the coding and decoding stage. The denoising model can therefore better extract and exploit shallow features, improving its robustness.
A backbone network is also introduced into the denoising model, containing batch normalization modules and PReLU activation functions. Batch normalization accelerates model convergence, improves training efficiency, and helps prevent over-fitting and vanishing gradients; compared with the ReLU activation adopted by most neural network methods, PReLU has stronger fitting ability and further strengthens the robustness of the model. Moreover, a global residual connection links the network's input and output, effectively reducing the loss of feature information and feeding it into the GRU networks.
In addition, two layers of GRU networks implemented with convolution operations are introduced, combining the temporal feature selection of the GRU with the spatial feature extraction of convolutional networks; feature information from earlier stages can thus be selected, its loss is effectively reduced, and the denoising effect of the model is markedly improved. The two GRU layers use different convolution kernels, which further enlarges the selection range and captures more feature information.
In the model training stage, each round computes the difference between the current and previous loss values and compares it with the convergence threshold to judge convergence; training completes once the maximum number of rounds is reached, enabling effective training of the denoising network. The method thus addresses the problems of existing convolutional-neural-network denoising methods (insufficient use of shallow image features, easy loss of feature information, and poor feature fusion), effectively improves the robustness of the image denoising algorithm, avoids the problem of insufficient features, and achieves a good denoising effect at different noise levels.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a frame structure diagram of an image denoising model according to the present invention;
FIG. 3 is a block diagram of a feature enhanced network (EBlock) in accordance with the present invention;
FIG. 4 is a block diagram of a ConvGRU1 network in accordance with the present invention;
FIG. 5 is a block diagram of a ConvGRU2 network in accordance with the present invention;
FIG. 6 is a noisy image, i.e., the noise map to be processed, provided by an embodiment of the present invention;
FIG. 7 is the image of FIG. 6 after denoising with the present invention.
Description of the embodiments
The invention is further described below with reference to the accompanying drawings.
As shown in figs. 1 to 5, an embodiment of the present invention carries out steps one to six exactly as described above: data acquisition, data preprocessing, model construction (EBlock, the backbone network, ConvGRU1 and ConvGRU2, and the reconstruction network), model training, model saving, and image denoising, with the same preferred settings (a maximum number of training rounds greater than 30, an initial learning rate of 1e-3 decayed to 10% after 30 iterations, and a separate training data set for each noise level).
Fig. 6 shows a noisy image provided by the embodiment of the present invention, and fig. 7 shows the same image after denoising; comparison shows that the method of the invention markedly removes image noise and greatly improves image quality.
In the invention, in the data acquisition process, training data sets are respectively set for different noise levels, and the data sets with different noise levels are well processed, so that the processing capability of a model for specific noise pollution in the subsequent training process is effectively improved. The obtained data set is preprocessed so as to be expanded and enhanced, so that the data quality can be improved, the over-fitting phenomenon can be avoided, and the overall performance can be fundamentally improved. In the model construction process, a feature enhancement network is introduced into a constructed denoising model, 4 hole convolution layers contained in the feature enhancement network and convolution layers of 4 different convolution kernels form 4 branch networks respectively, so that an input original noise image can enter the 4 branch networks simultaneously after entering the feature enhancement network, more feature information can be captured due to the fact that visual receptive fields can be increased by the hole convolution blocks, and the focus and extracted information of convolution kernels of different scales are different, so that shallow feature information extracted by the 4 branch networks is different, the focus visual field ranges are different, complementary effects can be achieved, and further richer image shallow features can be obtained after combination; the channel fusion layer is connected with the output ends of the 4 branch networks at the same time, so that the characteristic diagrams of the 16 channels output by each branch network can be fused together to form a combined characteristic diagram of 64 channels, thereby well fusing characteristic information and effectively solving the problem of insufficient acquisition of shallow characteristic information by matching with the 4 branch networks; the output of the channel fusion layer sequentially passes through 1 coding layer and 1 decoding layer, so that 
the channel number can be converted from 64 channels to 128 channels, and then the channel number can be converted from 128 channels to 64 channels, and thus, the shallow layer characteristics can be further enhanced in the coding and decoding stage through the combination of the encoder and the decoder. Therefore, the denoising model can better extract and utilize shallow layer characteristics, and the robustness of the model is improved. Meanwhile, a backbone network is also introduced into the constructed denoising model, wherein a batch normalization module and a PReLU activation function are arranged in the backbone network, the batch normalization module can accelerate the convergence rate of the model, the training efficiency of the model is improved, and the phenomena of fitting and gradient disappearance can be prevented; compared with a ReLU activation function adopted by most neural network methods, the PReLU activation function has stronger fitting capability, and can further enhance the robustness of the model; moreover, global residual connection is further arranged in the network, so that input and output can be effectively connected, loss of characteristic information can be effectively reduced, and the characteristic information can be effectively input into the GRU network. In addition, two layers of GRU networks are introduced into the constructed denoising model, the GRU networks are realized by adopting the technical thought of convolution operation, and the time sequence characteristic information of the GRU and the space characteristic information of the convolution network are combined, so that the characteristic information input in the earlier stage can be selected, the loss of the characteristic information is effectively reduced, and the denoising effect of the model can be remarkably improved. 
The two GRU layers use different convolution kernels, which further widens the selection range and allows more feature information to be acquired. During model training, each round computes the difference between the loss function of the current round and that of the previous round and compares this difference with the convergence threshold to judge whether the loss function has converged; training completes once the maximum number of training rounds is reached, realizing effective training of the main denoising network. The method thus addresses the shortcomings of existing denoising methods based on convolutional neural networks, namely insufficient use of shallow image features, easy loss of feature information, and poor fusion of feature information. It not only effectively improves the robustness of the image denoising algorithm but also effectively avoids the problem of insufficient features, and it achieves a good image denoising effect under different noise levels.
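The per-round convergence check described above can be sketched in pure Python (the loss values and threshold below are made up for illustration, and the checkpoint bookkeeping stands in for whatever model-saving mechanism the training framework provides):

```python
def train(losses, conv_threshold, max_rounds):
    """Per-round convergence check: after each round, compare the change
    in the loss against the threshold; checkpoint the model only on
    rounds where the loss has converged, and stop at max_rounds."""
    saved_rounds = []
    prev_loss = None
    for rnd, loss in enumerate(losses[:max_rounds], start=1):
        # (forward pass, loss computation, backprop would happen here)
        if prev_loss is not None and abs(prev_loss - loss) < conv_threshold:
            saved_rounds.append(rnd)   # converged: save model this round
        prev_loss = loss
    return saved_rounds

# Illustrative loss curve: the round-to-round change drops below the
# 0.01 threshold from round 4 onward, so rounds 4 and 5 are saved.
rounds = train([1.0, 0.5, 0.2, 0.195, 0.194],
               conv_threshold=0.01, max_rounds=30)
```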
Finally, it should be noted that the above description is intended only to illustrate and explain the technical solution of the present invention in detail, not to limit it. Although the present invention has been described in detail with reference to the foregoing, those skilled in the art will understand that the described technical solution may still be modified, or parts of it replaced by equivalents; any modification, equivalent substitution, improvement, or simple deduction made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (5)

1. The image denoising method based on the characteristic enhancement network and the GRU network is characterized by comprising the following steps of:
step one, data acquisition;
collecting images without noise and images polluted by noise with different degrees to form data sets respectively, and constructing the data sets required by model training;
step two, data preprocessing;
preprocessing the acquired data set to expand and strengthen the acquired data set;
thirdly, constructing a model;
building a characteristic enhancement network, a backbone network, two layers of different GRU networks and a reconstruction network, and connecting these networks in sequence to form an initial denoising model;
the characteristic enhancement network is named EBlock and comprises 4 hole convolution layers with expansion factors of 2, 1 convolution layer with convolution kernel size of 1, 1 convolution layer with convolution kernel size of 3, 1 convolution layer with convolution kernel size of 5, 1 convolution layer with convolution kernel size of 7, 1 channel fusion layer, 1 coding layer and 1 decoding layer; wherein, 4 hole convolution layers are combined with batch normalization and ReLU activation functions, the convolution layers of 4 different convolution kernels are combined with batch normalization and PReLU activation functions, the 4 hole convolution layers and the convolution layers of 4 different convolution kernels form 4 branch networks respectively, and the 4 branch networks, the 1 channel fusion layer, the 1 coding layer and the 1 decoding layer are connected in sequence;
the backbone network comprises 1 combination group formed by a convolution layer and a ReLU activation layer, 16 combination groups formed by a convolution layer, a batch normalization layer and a PReLU activation layer, and the 1 combination group and the 16 combination groups are connected in sequence; meanwhile, global residual error connection is arranged between the input and the output of the backbone network;
the construction of the two layers of GRU networks is based on convolutional GRUs, named ConvGRU1 and ConvGRU2 respectively, with ConvGRU1 and ConvGRU2 connected in sequence; the GRU network combines a ReLU activation function, a Sigmoid activation function, a Tanh activation function, convolution, pixel-by-pixel addition and pixel-by-pixel multiplication operations; the GRU network regards the previous input as the previous hidden state, and the previous input and the current input are subjected to feature fusion through an update gate and a reset gate so as to realize finer feature selection; wherein the convolution kernel size of ConvGRU1 is 3 and the convolution kernel size of ConvGRU2 is 5;
the reconstruction network comprises 1 convolution layer and 1 subtraction operation; the convolution layer converts the 64-channel characteristic image output by ConvGRU2 in the GRU network into an image with the same number of channels as the original noise image, and its output is the noise information learned by the model; the subtraction operation takes the original noise image as a priori input and subtracts the noise information from it to obtain a clean image;
training a model;
s41: setting the maximum training round number, a convergence threshold value and other super parameters required by training;
s42: adjusting and preparing a training data set and a test data set, and starting end-to-end training of the initial denoising model;
s43: the initial denoising model inputs a noisy image during training and outputs a denoised image; each round of training performs the following operations: judging whether the learning rate needs to be adjusted in the current round; transmitting the prepared noise image data into the network model; predicting with the model to generate an image; calculating the error between the image generated by the model prediction and the clean image and calculating the loss function; back-propagating and updating the parameters of each layer; calculating the difference between the loss function of the current round and that of the previous round and comparing the difference with the convergence threshold to judge whether the loss has converged; when converged, first saving the model in the training process and then carrying out the next round of training; when not converged, carrying out the next round of training directly without saving the model;
s44: after the loss function converges and reaches the maximum training round, the model training is completed;
step five, model preservation;
saving the trained model to form an image denoising model;
step six, denoising the image;
and (3) carrying out image denoising by using an image denoising model, wherein the denoising process is a forward propagation process, inputting an image to be denoised, and outputting a predicted clean image after model processing.
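The ConvGRU gating and the reconstruction subtraction of claim 1 can be sketched as follows (a deliberately simplified pixel-wise form: the learned convolutions are replaced by fixed scalar weights purely to show how the update gate z and reset gate r blend the previous hidden state with the current input; the weights and example values are illustrative only):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x, wz=1.0, wr=1.0, wh=1.0):
    """Simplified GRU update (convolutions replaced by scalar weights):
    z and r lie in (0, 1) and gate how much of the previous hidden
    state is kept versus overwritten by the candidate state."""
    z = sigmoid(wz * (h_prev + x))            # update gate
    r = sigmoid(wr * (h_prev + x))            # reset gate
    h_cand = np.tanh(wh * (r * h_prev + x))   # candidate hidden state
    return (1 - z) * h_prev + z * h_cand      # pixel-by-pixel blend

# Reconstruction stage: the network predicts noise, and the subtraction
# operation removes it from the noisy input (the a-priori image).
noisy = np.array([[0.8, 0.6], [0.4, 0.9]])
predicted_noise = np.array([[0.1, 0.05], [0.02, 0.1]])
clean = noisy - predicted_noise

h_new = gru_step(np.zeros((2, 2)), noisy)  # first step: empty hidden state
```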
2. The image denoising method based on a feature enhancement network and a GRU network according to claim 1, wherein in the first step, when collecting images contaminated with different degrees of noise to form data sets, paired training data sets are respectively constructed for the different degrees of noise contamination, i.e., a training data set is set for each noise level.
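The paired, per-noise-level training sets of claim 2 can be sketched as follows (this assumes additive white Gaussian noise, a common corruption model for this kind of denoiser, though the claim does not name one; the sigma values and function names are illustrative):

```python
import numpy as np

def make_paired_sets(clean_images, noise_levels, seed=0):
    """One paired (noisy, clean) training set per noise level, so that
    training can target each specific degree of contamination."""
    rng = np.random.default_rng(seed)
    sets = {}
    for sigma in noise_levels:
        pairs = []
        for img in clean_images:
            noisy = img + rng.normal(0.0, sigma / 255.0, img.shape)
            pairs.append((noisy, img))   # noisy input, clean target
        sets[sigma] = pairs
    return sets

clean = [np.zeros((8, 8)) for _ in range(3)]
sets = make_paired_sets(clean, noise_levels=(15, 25, 50))
```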
3. The image denoising method based on the feature enhancement network and the GRU network according to claim 1 or 2, wherein in the second step, the preprocessing comprises enhancement processing and normalization processing of the data set; the enhancement processing comprises image scaling processing and image inversion transformation processing, wherein during image scaling the images are scaled by factors of 1.0, 0.9, 0.8 and 0.7 respectively, and during inversion transformation the images are flipped; the normalization processing comprises image format normalization, in which the dimensions of the images are checked for consistency and, if inconsistent, the images are transformed to keep their sizes consistent.
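A NumPy sketch of the augmentation in claim 3 (the nearest-neighbour `rescale` helper is a stand-in for a proper resize routine, and the function names are illustrative): each image yields scaled copies at factors 1.0, 0.9, 0.8 and 0.7, plus an inversion (flip) of each.

```python
import numpy as np

SCALES = (1.0, 0.9, 0.8, 0.7)

def rescale(img, factor):
    """Nearest-neighbour rescale (placeholder for a real resize)."""
    h, w = img.shape[:2]
    nh, nw = max(1, int(h * factor)), max(1, int(w * factor))
    rows = (np.arange(nh) * h / nh).astype(int)
    cols = (np.arange(nw) * w / nw).astype(int)
    return img[np.ix_(rows, cols)]

def augment(img):
    """Scaled copies at the four factors, plus a flip of each."""
    out = []
    for s in SCALES:
        scaled = rescale(img, s)
        out.append(scaled)
        out.append(np.fliplr(scaled))  # inversion (flip) transform
    return out

img = np.arange(100.0).reshape(10, 10)
augmented = augment(img)   # 8 variants: 4 scales x {original, flipped}
```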
4. The image denoising method based on a feature enhancement network and a GRU network according to claim 1, wherein in step S41 of the fourth step, the maximum number of training rounds is greater than 30.
5. The image denoising method based on a feature enhancement network and a GRU network according to claim 1, wherein in the fourth step, the initial learning rate is set to 1e-3 during training, and the learning rate is attenuated to 10% of the original value every 30 iterations.

Publications (2)

Publication Number Publication Date
CN116051396A 2023-05-02
CN116051396B 2023-07-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant