CN111127331A - Image denoising method based on pixel-level global noise estimation coding and decoding network - Google Patents

Image denoising method based on pixel-level global noise estimation coding and decoding network

Info

Publication number
CN111127331A
CN111127331A (application CN201911005554.XA; granted as CN111127331B)
Authority
CN
China
Prior art keywords
coding
pixel
output
decoding
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911005554.XA
Other languages
Chinese (zh)
Other versions
CN111127331B (en)
Inventor
唐鹏靓
鞠国栋
沈良恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qidi Yuanjing Shenzhen Technology Co ltd
Original Assignee
Guangdong Qidi Tuwei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Qidi Tuwei Technology Co ltd filed Critical Guangdong Qidi Tuwei Technology Co ltd
Priority to CN201911005554.XA priority Critical patent/CN111127331B/en
Publication of CN111127331A publication Critical patent/CN111127331A/en
Application granted granted Critical
Publication of CN111127331B publication Critical patent/CN111127331B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to an image denoising method based on a pixel-level global noise estimation coding and decoding network, which comprises the following steps: an original noisy picture is input into the input module of the coding network, preliminary feature extraction is performed on it by convolution, and an original feature map is output; the original feature map is processed by several cascaded coding modules in the coding network, which output a denoised high-level feature map of smaller spatial size and higher semantic level; the high-level feature map is processed by several skip-connected decoding modules in a decoding network symmetric to the coding network, yielding a denoised output feature map that retains both spatial and semantic information; finally, the output feature map passes through the output module of the decoding network, where convolution maps it to the output dimensions and the final clean image is produced. The method fully considers the characteristics of real image noise, namely its global structure and its correlation with pixel values, while balancing denoising effect and running speed.

Description

Image denoising method based on pixel-level global noise estimation coding and decoding network
Technical Field
The invention relates to the technical field of computer vision images, in particular to an image denoising method based on a pixel-level global noise estimation coding and decoding network.
Background
As important information carriers, digital images play an increasingly important role in daily production and life. Because image information is intuitive, vivid, and rich in content, it is widely applied in aerospace, industrial production, military, medical, communication, and other fields, and is closely tied to people's work and lives. However, during digitization and transmission, digital images are often degraded by interference from imaging equipment and external environmental noise, which reduces image quality and complicates subsequent research and processing. Image denoising is therefore a basic and very important low-level computer vision task with high scientific value and practical significance.
Image denoising is the technique of removing the noise introduced while an image is acquired so as to recover the original clean image; it is the principal software-side means of addressing the image noise problem. As an important low-level computer vision task, it provides key technical support for computers to better observe, analyze, and process pictures, and it has great application value in many fields such as medical imaging, satellite imaging, and surveillance systems.
The best current traditional image denoising algorithms include non-local self-similarity methods, sparse coding, and block-matching models. Their common drawback is that they involve complex optimization steps at inference time, which makes them costly in time; moreover, they expose many tunable parameters, which hinders quick use. With the development of deep learning, more end-to-end deep neural networks have been used for image denoising: the deep denoising convolutional neural network (DnCNN) introduces residual connections and batch normalization to reduce the difficulty of training deep networks and achieves a strong Gaussian-noise removal effect; the very deep residual encoder-decoder network (REDNet) uses a fully convolutional coding and decoding network and introduces a symmetric skip structure to balance spatial information and receptive-field information. The strong learning capability and end-to-end simplicity of neural networks greatly improve the denoising effect and reduce time consumption. However, these deep learning methods target synthetic images with artificially added Gaussian noise, which is independent of pixel values; they cannot account for the fact that real-world image noise, in addition to its Gaussian character, is correlated with pixel values. Consequently, their generalization to real-world noisy images is poor and their results are rather limited.
Disclosure of Invention
In view of the above, it is necessary to provide an image denoising method based on a pixel-level global noise estimation coding and decoding network, one that fully considers the characteristics of real image noise (its global structure and its correlation with pixel values) while balancing denoising effect against running speed.
An image denoising method based on a pixel level global noise estimation coding and decoding network comprises the following steps:
step 1, inputting an original noisy picture into an input module of a coding network, performing primary feature extraction on the original noisy picture by using convolution, and outputting an original feature map;
step 2, processing the original feature map by a plurality of cascaded coding modules in the coding network, wherein each coding module consists of a down-sampling layer, a convolution layer and a pixel level global noise estimation submodule, the down-sampling layer and the convolution layer carry out high-level feature extraction on the original feature map, the pixel level global noise estimation submodule carries out noise estimation and removal on each level of features, and finally, the denoised high-level feature map with smaller space size and higher semantic level is output;
step 3, processing the high-level feature map by a plurality of decoding modules with a skip connection structure in a decoding network symmetrical to the coding network, wherein each decoding module consists of an up-sampling layer and convolution layers; the spatial information of earlier layers is combined with the high-level feature information output by the coding modules, finally yielding a denoised output feature map that retains both spatial and semantic information;
and 4, mapping the output characteristic graph of the decoding network to the output characteristic dimension by using convolution processing through an output module of the decoding network, and outputting a final clear image.
Specifically, in step 1, the original noisy picture is input into the input module of the coding network, feature extraction by four cascaded 3 × 3 × 32 convolutions is performed, and an original feature map is output.
The convolution kernel output channel of the output module of the decoding network is consistent with the number of the channels of the original noisy picture input by the input module.
The coding network comprises four cascaded coding modules, and each coding module consists of a down-sampling layer, four convolution layers and two pixel level global noise estimation sub-modules.
The specific method for processing the input characteristics by the encoding module in the step 2 comprises the following steps:
(1) At the beginning of each coding module, maximum pooling reduces the size of the input features to 1/2, performing spatial feature fusion, enlarging the receptive field of the convolutional network, and allowing more semantic information to be extracted;
(2) two cascaded convolutional layers extract the feature information after the size reduction and double the number of feature channels to extract more features; in the four cascaded coding modules, the convolution kernels used are 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256, and 3 × 3 × 512 in sequence, and kernels with the same number of channels are used within a module;
(3) a pixel-level global noise estimation submodule extracts the pixel-level global noise information contained in the features, and the extracted noise information is removed through a residual connection inside the submodule;
(4) a convolution layer integrates and buffers the once-denoised features in preparation for further denoising; in the four cascaded coding modules, the convolution kernels used are 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256, and 3 × 3 × 512 in sequence;
(5) a second pixel-level global noise estimation submodule further extracts the pixel-level global noise information contained in the features and removes it through a residual connection inside the submodule;
(6) finally, the module uses a convolution layer to integrate and buffer the twice-denoised features in preparation for the next module; in the four cascaded coding modules, the convolution kernels used are 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256, and 3 × 3 × 512 in sequence, so the numbers of channels output by the four cascaded coding modules are 64, 128, 256, and 512 in sequence.
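The size and channel bookkeeping of the six steps above can be sketched as follows (a hypothetical Python helper for illustration; the function and argument names are not from the patent, only the numbers are):

```python
# Trace the feature-map shapes through the four cascaded coding modules.
# Each module halves the spatial size with max pooling and its
# convolutions raise the channel count to 64, 128, 256, 512 in turn.

def encoder_shapes(h, w):
    """Return the (H, W, C) output shape of each of the four coding modules."""
    shapes = []
    for c_out in [64, 128, 256, 512]:
        h, w = h // 2, w // 2         # max pooling: size -> 1/2
        shapes.append((h, w, c_out))  # convolutions set the channel count
    return shapes

print(encoder_shapes(256, 256))
# [(128, 128, 64), (64, 64, 128), (32, 32, 256), (16, 16, 512)]
```

For a 256 × 256 input picture, the deepest feature map is thus 16 × 16 × 512, matching the "smaller spatial size, higher semantic level" description.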
The specific method by which the pixel-level global noise estimation submodule of steps (3) and (5) processes the input features comprises the following steps:
(1) The input of the pixel-level global noise estimation submodule is a feature map of height H, width W, and channel number C; the height and width depend on the size of the noisy image input to the network and on the number of down-sampling operations, and the channel number matches the number of feature channels inside the coding module in which the submodule sits (64, 128, 256, and 512 for the four coding modules, respectively). The input feature first enters the first branch and is output directly as a residual branch, providing the feature map before denoising;
(2) the input feature also enters the second branch, which fuses global channel information: the H × W × C feature map passes through two cascaded 1 × 1 convolution layers that fuse the information of the C channels into a single channel of global channel information while preserving the spatial layout, giving a feature map of dimension H × W × 1, which is then reshaped to HW × 1 and output as the feature fused with global channel information;
(3) the input feature also enters the third branch, which fuses global spatial information: a global mean pooling layer fuses the spatial information of the H × W × C feature map into single-point information while preserving the channel information, giving a feature map of dimension 1 × C; a fully connected layer of dimension C/4 followed by one of dimension C then corrects the channel features, the output size remaining 1 × C, giving the feature output fused with global spatial information;
(4) the HW × 1 global-channel fusion feature and the 1 × C global-spatial fusion feature are matrix-multiplied to give an HW × C feature map, which fuses the global characteristics of channel and space while retaining partial pixel-level spatial and channel information during the computation, completing the pixel-level global noise estimation; the noise estimate is then reshaped to H × W × C so that it matches the input feature size, and finally a 1 × 1 convolution layer performs feature mapping to produce the output noise estimation features;
(5) the output pixel-level global noise estimate and the residual branch are combined pixel by pixel, removing the estimated noise value from the noisy features and giving the final output of the pixel-level global noise estimation submodule.
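The shape flow of the five steps above can be traced with a rough NumPy sketch. This is an illustration only: random matrices stand in for the learned 1 × 1 convolutions and fully connected layers, and a plain subtraction stands in for the residual combination by which the patent removes the estimate; only the branch shapes and the matrix product follow the description.

```python
import numpy as np

def pixel_global_noise_estimate(x, rng):
    """Shape-level sketch of the pixel-level global noise estimation
    submodule for an input feature map x of shape (H, W, C)."""
    h, w, c = x.shape

    # Branch 2: fuse the C channels into one, keeping spatial layout,
    # then reshape H x W x 1 -> (HW, 1).
    w1 = rng.standard_normal((c, 1))              # stand-in for 1x1 convs
    channel_branch = x.reshape(h * w, c) @ w1     # (HW, 1)

    # Branch 3: global mean pooling -> (1, C), then FC layers C/4 -> C.
    pooled = x.mean(axis=(0, 1)).reshape(1, c)    # (1, C)
    w2 = rng.standard_normal((c, c // 4))         # stand-in for FC C/4
    w3 = rng.standard_normal((c // 4, c))         # stand-in for FC C
    spatial_branch = (pooled @ w2) @ w3           # (1, C)

    # Matrix product (HW, 1) x (1, C) -> (HW, C), back to (H, W, C).
    noise = (channel_branch @ spatial_branch).reshape(h, w, c)

    # Residual branch: remove the estimated noise from the input.
    return x - noise

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 64))   # a feature map inside coding module 1
y = pixel_global_noise_estimate(x, rng)
print(y.shape)  # (8, 8, 64): output matches the input feature size
```

Note how the outer product of an HW × 1 column and a 1 × C row yields a full HW × C estimate, which is what lets the module assign a distinct noise value to every pixel and channel while each branch remains a cheap bottleneck.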
The decoding network comprises four decoding modules, and each decoding module consists of a bilinear interpolation upsampling layer and four convolution layers.
The specific method for processing the input characteristics by the decoding module in the step 3 comprises the following steps:
a. at the beginning of each decoding module, for an input feature map, the spatial size of the input feature map is up-sampled to 2 times of the original spatial size by using a bilinear interpolation method, and the number of channels is unchanged so as to gradually recover the size of an input original image;
b. feature outputs from the input module and the coding modules whose spatial size equals that of the bilinear-interpolation output in the current decoding module are taken and concatenated with the current bilinear-interpolation output along the channel dimension; the numbers of channels output by the input module and the first three cascaded coding modules are 32, 64, 128, and 256 in sequence, and the numbers of channels entering the four cascaded decoding modules are 256, 128, 64, and 32 respectively, so the concatenated channel dimensions become 512, 256, 128, and 64, which serve in sequence as the inputs of the subsequent convolution layers of the decoding network, making combined use of the semantic information of the denoised feature map and the spatial information of the shallow feature map;
c. the concatenated feature maps with 512, 256, 128, and 64 channels are fed in sequence into the decoding convolution layers of the four cascaded decoding modules for feature fusion, merging the spatial information of the shallow feature maps with the deep denoised semantic information to obtain a large-spatial-size, noise-free feature map; each decoding module's convolution stage consists of four convolutional layers with identical kernels, and the kernels used in the four cascaded decoding modules are 3 × 3 × 256, 3 × 3 × 128, 3 × 3 × 64, and 3 × 3 × 32 in sequence, so the numbers of channels output by the four cascaded decoding modules are 256, 128, 64, and 32 in sequence.
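The channel arithmetic of the skip connections in steps a to c can be checked with a small sketch (hypothetical helper name; all numbers come from the text above):

```python
# Verify the concatenated channel counts of the four decoding modules:
# each decoder input is joined with the symmetric encoder/input-module
# feature map of the same spatial size.

def decoder_channels():
    skip = [256, 128, 64, 32]  # from coding modules 3, 2, 1 and the input module
    up = [256, 128, 64, 32]    # channels entering each decoding module
    out = [256, 128, 64, 32]   # channels after each module's convolutions
    concat = [s + u for s, u in zip(skip, up)]
    return concat, out

concat, out = decoder_channels()
print(concat)  # [512, 256, 128, 64]
print(out)     # [256, 128, 64, 32]
```

Each concatenation exactly doubles the channel count, and the module's convolutions then halve it again, which keeps the decoder symmetric to the encoder.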
The invention has the advantages and positive effects that:
1. The invention inserts a brand-new pixel-level global noise estimation module into a coding and decoding denoising network. Addressing the Gaussian character of real image noise, the module's two branches fuse the noise information of the input features across the global channel dimension and the global spatial dimension respectively; the fusion also accounts for the correlation of real image noise with individual pixel values, preserving the corresponding spatial and channel information while channel and spatial information are fused. The two branches are finally combined into a pixel-level global noise estimate, and the noise information is removed through a residual branch, yielding a better denoising effect.
2. The invention has a reasonable design: the pixel-level global noise estimation submodule adopts a multi-branch, multi-dimensional fusion scheme, and each branch adopts a bottleneck structure, improving the denoising effect while controlling the number of network parameters. The invention uses a coding and decoding network as the backbone, whose output is the denoised clean image. The network is trained on pairs of actually captured noisy and clean images with the mean absolute loss function as the objective, and the denoising effect is evaluated by comparing the output image with the true clean image; the method improves on the denoising effect of a plain backbone network while keeping the running time of the algorithm in check.
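The mean absolute loss mentioned above is the standard L1 objective between the network output and the captured clean image; a minimal sketch (illustrative only, not the training code):

```python
import numpy as np

def mean_absolute_loss(pred, target):
    """Mean absolute (L1) loss between a network output and the
    ground-truth clean image, the training objective described above."""
    return float(np.mean(np.abs(pred - target)))

pred = np.array([0.2, 0.5, 0.9])    # toy "denoised" pixel values
target = np.array([0.0, 0.5, 1.0])  # toy ground-truth pixel values
print(mean_absolute_loss(pred, target))  # (0.2 + 0.0 + 0.1) / 3 = 0.1
```

Compared with a squared-error objective, the L1 loss penalizes large residuals less aggressively, which is a common choice for real-noise image restoration.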
Drawings
FIG. 1 is a backbone framework diagram of a pixel level global noise estimation codec network of the present invention;
FIG. 2 is a block diagram of the pixel level global noise estimation sub-module of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It should be noted that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and all other embodiments obtained by those skilled in the art without any inventive work based on the embodiments of the present invention belong to the protection scope of the present invention.
The image denoising method based on the pixel level global noise estimation coding and decoding network provided by the embodiment of the invention, as shown in fig. 1 and fig. 2, comprises the following steps:
and step S1, inputting the original noisy picture into an input module of the coding network, wherein the input module consists of four convolution layers of 3 multiplied by 32, performing primary feature extraction on the noisy picture, and outputting an original feature map.
And step S2, sequentially inputting the original feature map into four cascaded coding modules, wherein each coding module consists of a down-sampling layer, four convolutional layers and two pixel-level global noise estimation sub-modules, the coding module performs high-level feature extraction on the input original feature map through the plurality of convolutional layers and the down-sampling layer, performs noise estimation and removal on each level of features through the pixel-level global noise estimation sub-modules, and finally outputs the feature map with smaller space size and higher semantic level after denoising.
The specific implementation method of step S2 is as follows:
and S2.1, at the beginning of each coding module, reducing the size of the input features to 1/2 of the original size by using maximum pooling, performing spatial feature fusion, improving the receptive field of the convolutional network, and extracting more semantic information.
In step S2.2, two cascaded convolutional layers extract the feature information after the size reduction and double the number of feature channels to extract more features; in the four cascaded coding modules, the convolution kernels used are 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256, and 3 × 3 × 512 in sequence, and kernels with the same number of channels are used within a module.
And S2.3, extracting pixel level global noise information contained in the characteristics through a pixel level global noise estimation submodule, and removing the extracted noise information through residual connection in the submodule.
The specific implementation method of step S2.3 is as follows:
In step S2.3.1, a feature map of height H, width W, and channel number C is input into the pixel-level global noise estimation submodule. The height and width depend on the size of the noisy image input to the network and on the number of down-sampling operations; the channel number matches the number of feature channels inside the coding module in which the submodule sits, the internal channel numbers of the four coding modules being 64, 128, 256, and 512 respectively. The first branch of the submodule outputs the input directly as a residual branch, providing the feature map before denoising.
In step S2.3.2, the input feature also enters the second branch, which fuses global channel information: the H × W × C feature map passes through two cascaded 1 × 1 convolution layers that fuse the information of the C channels into a single channel of global channel information while preserving the spatial layout, giving an H × W × 1 feature map that is then reshaped to HW × 1 to obtain the feature output fused with global channel information.
In step S2.3.3, the input feature also enters the third branch, which fuses global spatial information: a global mean pooling layer fuses the spatial information of the H × W × C feature map into single-point information while preserving the channel information, giving a 1 × C feature map; a fully connected layer of dimension C/4 followed by one of dimension C then corrects the channel features, the output remaining 1 × C, to obtain the feature output fused with global spatial information.
In step S2.3.4, the HW × 1 global-channel fusion feature and the 1 × C global-spatial fusion feature are matrix-multiplied to give an HW × C feature map, which fuses the global characteristics of channel and space while retaining partial pixel-level spatial and channel information during the computation, completing the pixel-level global noise estimation; the noise estimate is then reshaped to H × W × C so that it matches the input feature size, and finally a 1 × 1 convolution layer performs feature mapping to produce the output noise estimation features.
In step S2.3.5, the output pixel-level global noise estimate and the residual branch are combined pixel by pixel, removing the estimated noise value and giving the final output of the pixel-level global noise estimation submodule.
And S2.4, integrating and buffering the characteristics subjected to primary denoising by using a convolution layer to prepare for further denoising, wherein in the four cascaded coding modules, the convolution kernels are respectively 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256 and 3 × 3 × 512 in sequence.
And S2.5, further extracting pixel level global noise information contained in the features through a pixel level global noise estimation module, and removing the extracted noise information through residual connection in the module.
In step S2.6, the module uses a convolution layer to integrate and buffer the twice-denoised features in preparation for the next module; in the four cascaded coding modules, the convolution kernels used are 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256, and 3 × 3 × 512 in sequence, so the numbers of channels output by the four cascaded coding modules are 64, 128, 256, and 512 in sequence.
In step S3, the high-level feature map output in step S2 is input into a decoding network symmetric to the coding network, composed of four decoding modules, each consisting of bilinear-interpolation up-sampling and four convolution layers. Each module uses a skip connection structure to combine the rich spatial information of earlier layers with the high-level feature information output by the coding modules, finally obtaining a denoised feature map that retains both spatial and semantic information.
The specific implementation method of step S3 is as follows:
and S3.1, at the beginning of each decoding module, for the input feature map, using a bilinear interpolation method to up-sample the space size of the input feature map to 2 times of the original space size, wherein the number of channels is unchanged, and the bilinear interpolation method is used for gradually recovering the size of the input original image.
In step S3.2, the feature outputs of the input module and the coding modules whose spatial size equals that of the bilinear-interpolation output in the current decoding module (namely coding module 3 with decoding module 1, coding module 2 with decoding module 2, coding module 1 with decoding module 3, and the input module with decoding module 4) are taken and concatenated with the current bilinear-interpolation output along the channel dimension. The numbers of channels output by the input module and the first three cascaded coding modules are 32, 64, 128, and 256 in sequence; symmetrically, the numbers of channels entering the four cascaded decoding modules are 256, 128, 64, and 32 respectively, so the concatenated channel dimensions become 512, 256, 128, and 64 in sequence, serving as the inputs of the subsequent convolution layers of the decoding network and making combined use of the semantic information of the denoised feature map and the spatial information of the shallow feature map.
In step S3.3, the concatenated feature maps with 512, 256, 128, and 64 channels are fed in sequence into the decoding convolution layers of the four cascaded decoding modules for feature fusion, merging the spatial information of the shallow feature maps with the deep denoised semantic information to obtain a large-spatial-size, noise-free feature map; each decoding convolution stage consists of four convolutional layers with identical kernels, and the kernels used in the four cascaded decoding modules are 3 × 3 × 256, 3 × 3 × 128, 3 × 3 × 64, and 3 × 3 × 32 in sequence, so the numbers of channels output by the four cascaded decoding modules are 256, 128, 64, and 32 in sequence.
And step S4, mapping the output characteristic diagram of the decoding network to the output characteristic dimension through the convolution processing of the output module, and outputting the final clear image, wherein the output channel number of the convolution kernel is consistent with the channel number of the input original image of the input module.
The de-noised clear image can be obtained through the steps.
Finally, the network is trained with the mean absolute loss function as the objective, and its performance is evaluated using PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity Index), and running time, as follows:
and (3) testing environment: a tensoflow frame; ubuntu16.04 system; NVIDIA GTX 1080ti GPU
Test data: the selected dataset is the Darmstadt Noise Dataset (DND), containing 50 very-high-resolution pairs of real noisy and noise-free images.
Test method: to ensure fairness, the noise-free images of the dataset are not publicly released; every test is performed by the tested party submitting its processing results for the noisy images through an online system, which computes all results uniformly. During testing, the raw format is used as the network input; the computed clean image is saved in .mat format and submitted, and the system evaluates the denoising effect both on the raw output and after conversion to the standard color space (sRGB).
Test metrics: the invention is evaluated with PSNR, SSIM, running time, and similar indexes. The same metrics are computed for currently popular algorithms and the results are compared, showing that the method achieves strong results in real-image denoising.
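Of the metrics above, PSNR has a simple closed form; a minimal sketch follows (assuming images normalized to [0, 1]; this is not the DND evaluation system's exact code):

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE).
    Higher is better; identical images would give infinite PSNR."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4))
est = ref + 0.01  # uniform error of 0.01 -> MSE = 1e-4
print(round(psnr(ref, est), 1))  # 40.0
```

SSIM, by contrast, compares local luminance, contrast, and structure statistics, so it is usually computed with a library implementation rather than a one-liner.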
The test results were as follows:
(Comparison table of test results reproduced as an image in the original filing; numerical values are not recoverable here.)
As the comparison data show, the method outperforms all the other methods in denoising effect, and in practical tests its processing speed, measured by running time, also exceeds that of the other methods. Taken together, the method strikes a good balance between image denoising quality and running speed, achieving both a higher denoising level and a faster running speed.
The above embodiments express only several embodiments of the present invention, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the inventive concept, and these fall within the protection scope of the invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (8)

1. An image denoising method based on a pixel-level global noise estimation coding and decoding network is characterized by comprising the following steps:
step 1, inputting an original noisy picture into an input module of a coding network, performing primary feature extraction on the original noisy picture by using convolution, and outputting an original feature map;
step 2, processing the original feature map by a plurality of cascaded coding modules in the coding network, wherein each coding module consists of a down-sampling layer, a convolution layer, and a pixel-level global noise estimation submodule; the down-sampling layer and the convolution layer perform high-level feature extraction on the original feature map, the pixel-level global noise estimation submodule estimates and removes noise at each level of features, and finally a denoised high-level feature map with smaller spatial size and higher semantic level is output;
step 3, processing the high-level feature map by a plurality of decoding modules with a skip-connection structure in a decoding network symmetrical to the coding network, wherein each decoding module consists of an up-sampling layer and a convolution layer; the spatial information of the shallow layers is combined with the high-level feature information output by the coding modules, finally obtaining a denoised output feature map that takes both spatial information and semantic information into account;
and step 4, mapping the output feature map of the decoding network to the output feature dimension by convolution in an output module of the decoding network, and outputting the final clean image.
2. The image denoising method based on the pixel-level global noise estimation coding-decoding network as claimed in claim 1, wherein in step 1, the original noisy picture is input into the input module of the coding network, feature extraction by four 3 × 3 × 32 convolutions is performed, and the original feature map is output.
3. The image denoising method based on the pixel-level global noise estimation coding-decoding network of claim 2, wherein the number of the convolution kernel output channels of the output module of the decoding network is consistent with the number of the channels of the input module for inputting the original noisy picture.
4. The image denoising method based on the pixel-level global noise estimation coding-decoding network according to any one of claims 1 to 3, wherein the coding network comprises four cascaded coding modules, and each coding module is composed of a down-sampling layer, four convolutional layers and two pixel-level global noise estimation sub-modules.
5. The image denoising method based on the pixel-level global noise estimation coding-decoding network according to claim 4, wherein the specific method for processing the input features by the coding module of step 2 comprises the following steps:
step ⑴, at the beginning of each coding module, maximum pooling is used to reduce the spatial size of the input features to 1/2, performing spatial feature fusion, enlarging the receptive field of the convolutional network, and extracting more semantic information;
step ⑵, two cascaded convolution layers are used to extract feature information after the size reduction, and the number of feature channels is doubled so as to extract more features; in the four cascaded coding modules, the convolution kernels used are 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256, and 3 × 3 × 512 in sequence, and convolution kernels with the same number of channels are used within a module;
step ⑶, extracting pixel-level global noise information contained in the features through a pixel-level global noise estimation submodule, and removing the extracted noise information through residual connection in the submodule;
step ⑷, a convolution layer is used to integrate and buffer the features after the first denoising, in preparation for further denoising; in the four cascaded coding modules, the convolution kernels used are 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256, and 3 × 3 × 512 in sequence;
step ⑸, further extracting pixel level global noise information contained in the features through a pixel level global noise estimation submodule, and removing the extracted noise information through residual connection in the submodule;
step ⑹, finally, a convolution layer is used to integrate and buffer the features after the second denoising, in preparation for entering the next module; in the four cascaded coding modules, the convolution kernels used are 3 × 3 × 64, 3 × 3 × 128, 3 × 3 × 256, and 3 × 3 × 512 in sequence, so the numbers of channels output by the four cascaded coding modules are 64, 128, 256, and 512 in sequence.
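The channel and spatial-size bookkeeping of steps ⑴ to ⑹ can be sketched as follows. This is an illustrative sketch only; the function name and the example input size are assumptions, and only the shapes are modeled, not the convolutions or noise estimation themselves.

```python
def encoder_shapes(h, w, c=32, num_modules=4):
    """Track (H, W, C) through the cascaded coding modules: each module
    halves the spatial size via max pooling and doubles the channel count,
    giving 64, 128, 256, 512 channels for a 32-channel input feature map."""
    shapes = []
    for _ in range(num_modules):
        h, w, c = h // 2, w // 2, c * 2  # 1/2 pooling, channel doubling
        shapes.append((h, w, c))
    return shapes
```

For a 256 × 256 feature map from the 32-channel input module, this yields (128, 128, 64), (64, 64, 128), (32, 32, 256), and (16, 16, 512), matching the channel counts stated in the claim.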
6. The image denoising method based on the pixel-level global noise estimation coding-decoding network of claim 5, wherein the specific method of processing the input features by the pixel-level global noise estimation sub-module of steps ⑶ and ⑸ comprises the following steps:
① the input of the pixel-level global noise estimation submodule is a feature map with height H, width W, and channel number C; the height and width depend on the size of the noisy image input to the network and the number of down-sampling operations, and the channel number is consistent with the feature channel number inside the coding module where the current submodule is located (the internal channel numbers of the four coding modules are 64, 128, 256, and 512 respectively); the input feature first enters the first branch and is output directly as a residual branch, providing the feature map before denoising;
② the input features enter the second branch, the global-channel-information fusion branch: the feature map with dimension H × W × C passes through two cascaded 1 × 1 convolution layers, which fuse the information of the C channels into the global channel information of a single channel while preserving the spatial information of the features, yielding a feature map with dimension H × W × 1; the feature is then reshaped to HW × 1 and output as the feature fused with global channel information;
③ the input features enter the third branch, the global-spatial-information fusion branch: the spatial information of the feature with dimension H × W × C is fused into single-point information through a global average pooling layer while the channel information of the features is preserved, yielding a feature map with dimension 1 × C; a fully connected layer of dimension C/4 and a fully connected layer of dimension C are then applied in sequence to recalibrate the channel features, the output size still being 1 × C, yielding the feature output fused with global spatial information;
④ the HW × 1 feature fused with global channel information and the 1 × C feature fused with global spatial information are matrix-multiplied to obtain an HW × C feature map, which not only fuses the global features of channel and space but also retains part of the pixel-level spatial and channel information during the calculation, completing the pixel-level global noise estimation; the noise estimate is then reshaped to H × W × C, consistent with the input feature size, and finally a 1 × 1 convolution layer is used for feature mapping to obtain and output the noise estimation features;
⑤ the output pixel-level global noise estimate and the residual branch are combined pixel by pixel through the residual connection, so that the estimated noise value is removed from the noisy feature map, yielding the final output of the pixel-level global noise estimation submodule.
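The shape algebra of steps ① to ⑤ can be sketched with a toy example. This is a sketch only, assuming NumPy; it reproduces just the HW × 1 by 1 × C matrix product and reshape of step ④, not the convolutions, pooling, or fully connected layers, and the function names are assumptions.

```python
import numpy as np

def fuse_global_features(channel_feat, spatial_feat):
    """Matrix product of the two branch outputs (claim 6, step 4):
    channel_feat is HW x 1 (global channel info, spatial layout kept),
    spatial_feat is 1 x C (global spatial info, channel layout kept);
    their product is an HW x C pixel-level noise estimate."""
    assert channel_feat.shape[1] == 1 and spatial_feat.shape[0] == 1
    return channel_feat @ spatial_feat  # (HW, 1) @ (1, C) -> (HW, C)

def reshape_noise_estimate(noise_est, h, w):
    """Reshape the HW x C estimate back to H x W x C, matching the input size."""
    hw, c = noise_est.shape
    assert hw == h * w
    return noise_est.reshape(h, w, c)
```

With H = 2, W = 3, C = 4, the product of a 6 × 1 and a 1 × 4 array is a 6 × 4 map that reshapes to 2 × 3 × 4; combining this estimate with the residual branch corresponds to step ⑤.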
7. The image denoising method based on the pixel-level global noise estimation coding-decoding network according to any one of claims 1 to 3, wherein the decoding network comprises four decoding modules, each decoding module being composed of a bilinear-interpolation up-sampling layer and four convolution layers.
8. The image denoising method based on the pixel-level global noise estimation coding-decoding network according to claim 7, wherein the specific method for processing the input features by the decoding module of step 3 comprises the following steps:
a. at the beginning of each decoding module, for an input feature map, the spatial size of the input feature map is up-sampled to 2 times of the original spatial size by using a bilinear interpolation method, and the number of channels is unchanged so as to gradually recover the size of an input original image;
b. the feature outputs from the input module and the coding modules whose spatial size equals that of the current bilinear-interpolation output are taken and concatenated with the current bilinear-interpolation output along the channel dimension; the numbers of channels output by the input module and the first three cascaded coding modules are 32, 64, 128, and 256 in sequence, the numbers of channels input to the four cascaded decoding modules are 256, 128, 64, and 32 respectively, and the channel dimensions after concatenation become 512, 256, 128, and 64, which serve in sequence as inputs to the subsequent convolution layers of the decoding network; in this way the semantic information of the denoised feature map and the spatial information of the shallow feature map are jointly utilized;
c. the concatenated feature maps with 512, 256, 128, and 64 channels are respectively input, in sequence, into the decoding convolution layers of the four cascaded decoding modules for feature fusion, where the spatial information of the shallow feature map and the deep denoised semantic information are fused to obtain a large-spatial-size denoised feature map; each decoding module consists of four convolution layers with the same convolution kernel, and the convolution kernels used in the four cascaded decoding modules are 3 × 3 × 256, 3 × 3 × 128, 3 × 3 × 64, and 3 × 3 × 32 in sequence, so the numbers of channels output by the four cascaded decoding modules are 256, 128, 64, and 32 in sequence.
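The channel bookkeeping of steps a to c can be sketched as follows. This is illustrative only; the function and list names are assumptions, the modules are ordered deepest-first, and each up-sampled feature is paired with the same-resolution skip feature using the channel counts stated in the claim.

```python
def decoder_channel_flow():
    """Per claim 8: bilinear up-sampling keeps the channel count, concatenation
    with the same-resolution skip feature doubles it, and the decoding
    convolutions (3x3x256, 3x3x128, 3x3x64, 3x3x32) halve it again."""
    upsampled = [256, 128, 64, 32]  # channels entering each decoding module
    skips = [256, 128, 64, 32]      # matching encoder/input-module features
    concat = [u + s for u, s in zip(upsampled, skips)]
    conv_out = [c // 2 for c in concat]
    return concat, conv_out
```

This reproduces the concatenated channel counts 512, 256, 128, 64 and the output counts 256, 128, 64, 32 stated in the claim.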
CN201911005554.XA 2019-10-22 2019-10-22 Image denoising method based on pixel-level global noise estimation coding and decoding network Active CN111127331B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911005554.XA CN111127331B (en) 2019-10-22 2019-10-22 Image denoising method based on pixel-level global noise estimation coding and decoding network

Publications (2)

Publication Number Publication Date
CN111127331A true CN111127331A (en) 2020-05-08
CN111127331B CN111127331B (en) 2020-09-08

Family

ID=70495395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911005554.XA Active CN111127331B (en) 2019-10-22 2019-10-22 Image denoising method based on pixel-level global noise estimation coding and decoding network

Country Status (1)

Country Link
CN (1) CN111127331B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738956A (en) * 2020-06-24 2020-10-02 哈尔滨工业大学 Image denoising system based on characteristic modulation
CN112801910A (en) * 2021-02-08 2021-05-14 南京邮电大学 Channel state information image denoising method and indoor positioning model
CN113052558A (en) * 2021-03-30 2021-06-29 浙江畅尔智能装备股份有限公司 Automatic piece counting system for machining parts of power transmission tower and automatic piece counting method thereof
CN113762333A (en) * 2021-07-20 2021-12-07 广东省科学院智能制造研究所 Unsupervised anomaly detection method and system based on double-flow joint density estimation
CN114567359A (en) * 2022-03-02 2022-05-31 重庆邮电大学 CSI feedback method based on multi-resolution fusion convolution feedback network in large-scale MIMO system
CN114615499A (en) * 2022-05-07 2022-06-10 北京邮电大学 Semantic optical communication system and method for image transmission

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871306A (en) * 2016-09-26 2018-04-03 北京眼神科技有限公司 Method and device for denoising picture
US20180293710A1 (en) * 2017-04-06 2018-10-11 Pixar De-noising images using machine learning
CN108765334A (en) * 2018-05-24 2018-11-06 北京飞搜科技有限公司 A kind of image de-noising method, device and electronic equipment
CN109559281A (en) * 2017-09-26 2019-04-02 三星电子株式会社 Image denoising neural network framework and its training method
US20190304069A1 (en) * 2018-03-29 2019-10-03 Pixar Denoising monte carlo renderings using neural networks with asymmetric loss
CN110349103A (en) * 2019-07-01 2019-10-18 昆明理工大学 It is a kind of based on deep neural network and jump connection without clean label image denoising method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
RAPHAEL COUTURIER et al.: "Image Denoising Using a Deep Encoder-Decoder Network with Skip Connections", International Conference on Neural Information Processing *
SHI GUO et al.: "Toward Convolutional Blind Denoising of Real Photographs", arXiv *
XIAO-JIAO MAO et al.: "Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections", arXiv *
YUDA SONG et al.: "Dynamic Residual Dense Network for Image Denoising", Sensors 2019 *
LIANG Weipeng et al.: "Image denoising based on GAN generative adversarial networks and an exploration of the denoising principle", Science & Technology Information *

Also Published As

Publication number Publication date
CN111127331B (en) 2020-09-08

Similar Documents

Publication Publication Date Title
CN111127331B (en) Image denoising method based on pixel-level global noise estimation coding and decoding network
CN112233038A (en) True image denoising method based on multi-scale fusion and edge enhancement
CN109410261B (en) Monocular image depth estimation method based on pyramid pooling module
CN111145116B (en) Sea surface rainy day image sample augmentation method based on generation of countermeasure network
CN109800710B (en) Pedestrian re-identification system and method
CN108596841B (en) Method for realizing image super-resolution and deblurring in parallel
CN110189260B (en) Image noise reduction method based on multi-scale parallel gated neural network
CN110751649B (en) Video quality evaluation method and device, electronic equipment and storage medium
CN109584170B (en) Underwater image restoration method based on convolutional neural network
CN112419151B (en) Image degradation processing method and device, storage medium and electronic equipment
CN110009700B (en) Convolutional neural network visual depth estimation method based on RGB (red, green and blue) graph and gradient graph
CN111709900A (en) High dynamic range image reconstruction method based on global feature guidance
CN113487564B (en) Double-flow time sequence self-adaptive selection video quality evaluation method for original video of user
CN115953303B (en) Multi-scale image compressed sensing reconstruction method and system combining channel attention
CN116152591B (en) Model training method, infrared small target detection method and device and electronic equipment
CN115205147A (en) Multi-scale optimization low-illumination image enhancement method based on Transformer
CN112614061A (en) Low-illumination image brightness enhancement and super-resolution method based on double-channel coder-decoder
CN112200732B (en) Video deblurring method with clear feature fusion
CN111882516B (en) Image quality evaluation method based on visual saliency and deep neural network
CN116580192A (en) RGB-D semantic segmentation method and system based on self-adaptive context awareness network
CN115526779A (en) Infrared image super-resolution reconstruction method based on dynamic attention mechanism
CN113538402B (en) Crowd counting method and system based on density estimation
CN111242068A (en) Behavior recognition method and device based on video, electronic equipment and storage medium
CN107729885B (en) Face enhancement method based on multiple residual error learning
CN113379606A (en) Face super-resolution method based on pre-training generation model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240205

Address after: Building 1, Shuimu Yifang Building, No. 286 Nanguang Road, Dawangshan Community, Nantou Street, Nanshan District, Shenzhen City, Guangdong Province, 518000, 2207

Patentee after: Qidi Yuanjing (Shenzhen) Technology Co.,Ltd.

Country or region after: China

Address before: Unit 416, Tianan science and technology innovation building, Panyu energy saving science and Technology Park, 555 Panyu Avenue North, Panyu District, Guangzhou, Guangdong 510000

Patentee before: GUANGDONG QIDI TUWEI TECHNOLOGY CO.,LTD.

Country or region before: China

TR01 Transfer of patent right