CN110728627A - Image noise reduction method, device, system and storage medium - Google Patents


Info

Publication number
CN110728627A
CN110728627A
Authority
CN
China
Prior art keywords
image
neural network
convolutional neural
feature map
layer
Prior art date
Legal status
Pending
Application number
CN201810777110.7A
Other languages
Chinese (zh)
Inventor
陈玮逸夫
蔡赞赞
魏文燕
Current Assignee
Ningbo Sunny Opotech Co Ltd
Original Assignee
Ningbo Sunny Opotech Co Ltd
Priority date
Filing date
Publication date
Application filed by Ningbo Sunny Opotech Co Ltd filed Critical Ningbo Sunny Opotech Co Ltd
Priority to CN201810777110.7A priority Critical patent/CN110728627A/en
Publication of CN110728627A publication Critical patent/CN110728627A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/70
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The application provides an image noise reduction method, device, system and storage medium. The method comprises the following steps: performing convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map; performing nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and carrying out deconvolution operation on the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image. The convolutional neural network is trained in advance by utilizing a plurality of groups of matched noisy training images and noiseless true value images.

Description

Image noise reduction method, device, system and storage medium
Technical Field
The present application relates to the field of image processing, and in particular, to an image denoising method, apparatus, system, and storage medium.
Background
Owing to certain hardware limitations of a camera lens (such as the size of the light-sensing chip, the size of the aperture, etc.), the image captured by the camera lens may contain noise or defects. For example, an image photographed in a low-light environment may suffer from blur, ghosting, darkness, and heavy noise.
To solve the above problem, one method is to increase the exposure time to increase the amount of incoming light, thereby improving the overall brightness. However, doing so typically makes capture more difficult and increases the noise in the image, thereby degrading the user experience or visual perception. Another method is to post-process the image with a software noise reduction method to improve brightness and reduce noise.
Some conventional image denoising methods currently exist on the market, such as total variation, wavelet preprocessing, sparse coding, nuclear norm, or three-dimensional block-matching algorithms.
Disclosure of Invention
The application provides an image denoising method. The method comprises the following steps: performing convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map; performing nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and carrying out deconvolution operation on the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image. The convolutional neural network is trained in advance by utilizing a plurality of groups of matched noisy training images and noiseless true value images.
According to an embodiment of the present application, performing a convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map may include: cropping the initial image to fit a particular aspect ratio; and/or normalizing the initial image to convert pixel values of the initial image to within a particular range of values.
According to an embodiment of the application, the activation layer may comprise a rectified linear unit.
According to an embodiment of the present application, the non-linearly transforming the primary feature map by the activation layer of the convolutional neural network to obtain a secondary feature map may include: carrying out nonlinear transformation on the primary feature map through the activation layer to obtain an activation feature map; and downsampling the activation feature map through a pooling layer of the convolutional neural network to obtain the secondary feature map.
According to embodiments of the application, the pooling layers may include a maximum pooling layer or an average pooling layer.
According to an embodiment of the application, the training may include: shooting a subject under an environment with sufficient illumination to obtain the noise-free true-value image; photographing the subject in an environment with insufficient illumination to obtain the noisy training image; using the noisy training image as the initial image to obtain the reconstructed image using the convolutional neural network reconstruction; comparing the reconstructed image with the noise-free true-value image to obtain a training error; and iteratively back-propagating the training error through the convolutional neural network to modify parameters of the convolutional neural network until the training error satisfies a convergence condition.
According to an embodiment of the present application, the training process may utilize the L1 norm to regularize the parameters of the convolutional neural network.
The application also provides an image noise reduction device. The device comprises: a feature extractor for performing a convolution operation on an initial image containing noise through a feature extraction layer of the convolutional neural network to extract a primary feature map; a feature activator for performing a nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; an image reconstructor for performing a deconvolution operation on the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image; and a trainer for training the convolutional neural network in advance by utilizing a plurality of groups of matched noisy training images and noise-free true-value images.
The present application further provides an image noise reduction system, comprising: a processor; and a memory coupled to the processor and storing machine-readable instructions executable by the processor to perform the following operations. The operations include: performing convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map; performing nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and carrying out deconvolution operation on the secondary characteristic graph through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image, wherein the convolutional neural network is trained by utilizing a plurality of groups of matched noisy training images and noiseless truth-value images in advance.
The present application further provides a non-transitory machine-readable storage medium having stored thereon machine-readable instructions executable by a processor to perform the following operations. The operations include: performing convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map; performing nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and carrying out deconvolution operation on the secondary characteristic graph through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image, wherein the convolutional neural network is trained by utilizing a plurality of groups of matched noisy training images and noiseless truth-value images in advance.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a flow chart illustrating an image denoising method according to an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating a convolutional neural network for image noise reduction according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating a method of training a convolutional neural network according to an embodiment of the present application;
FIG. 4 is a schematic diagram showing an image noise reduction apparatus according to an embodiment of the present application; and
FIG. 5 is a schematic diagram illustrating an image noise reduction system according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant technical concepts and are not limitative of the technical concepts. It should be further noted that, for convenience of description, only portions related to the technical idea of the present application are shown in the drawings. It should be understood that, unless otherwise specified, ordinal words such as "first", "second", etc., used herein are used only to distinguish one element from another, and do not denote importance or priority. For example, the first training set and the second training set simply indicate that they are different training sets.
In addition, the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 is a flow chart illustrating an image denoising method 1000 according to an embodiment of the present application.
The image denoising method 1000 according to the present application may be implemented using a machine learning technique. For example, the image denoising method 1000 may be implemented using a CNN (Convolutional Neural Network).
In step S1010, the initial image containing noise is subjected to a convolution operation by the feature extraction layer of the CNN to extract a primary feature map. The initial image may be, for example, an image containing noise and defects taken in a low light environment. The feature extraction layer may include a plurality of convolution kernels to extract different types of features of the image.
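The convolution in step S1010 can be illustrated with a minimal NumPy sketch. The function name conv2d and the averaging kernel below are illustrative examples, not part of the disclosed network:

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide a single 2-D kernel over a single-channel image (no padding)."""
    kh, kw = kernel.shape
    h = (image.shape[0] - kh) // stride + 1
    w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            # Element-wise product of the kernel with the current image patch
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# A 3x3 averaging (smoothing) kernel applied to a noisy 5x5 image
rng = np.random.default_rng(0)
noisy = rng.random((5, 5))
feature_map = conv2d(noisy, np.full((3, 3), 1.0 / 9.0))
```

A real feature extraction layer would apply many such kernels (one per feature channel), each with learned weights and a bias, rather than a fixed averaging kernel.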
In step S1020, the primary feature map is non-linearly transformed by the activation layer of the CNN to obtain a secondary feature map. The convolution kernel applies only a linear transformation to the original image, and linear transformations alone are insufficient to characterize the semantics of image features. To enhance the semantic characterization capability of image features, a nonlinear activation layer is typically added. The nonlinear activation layer performs a nonlinear transformation on the primary feature map to obtain a secondary feature map with stronger semantic representation capability.
In step S1030, the secondary feature map is deconvoluted through the reconstruction layer of CNN to obtain a noise-reduced enhanced image. The deconvolution operation is the inverse of the convolution operation described above. Through the reconstruction layer, the enhanced image after noise reduction can be reconstructed based on the secondary feature map. That is, during reconstruction, some features that are characterized as noise may be removed.
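The deconvolution (transpose convolution) of step S1030 can be sketched as follows. The scatter-and-add formulation below is one common way to implement it; the function name and kernel values are illustrative:

```python
import numpy as np

def transpose_conv2d(feature_map, kernel, stride=2):
    """Upsample a feature map by scattering kernel-weighted copies (no padding)."""
    kh, kw = kernel.shape
    h = (feature_map.shape[0] - 1) * stride + kh
    w = (feature_map.shape[1] - 1) * stride + kw
    out = np.zeros((h, w))
    for i in range(feature_map.shape[0]):
        for j in range(feature_map.shape[1]):
            # Each input value stamps a kernel-shaped patch into the output
            out[i*stride:i*stride+kh, j*stride:j*stride+kw] += feature_map[i, j] * kernel
    return out

# A 2x2 feature map upsampled with a 3x3 kernel at stride 2 yields a 5x5 output
small = np.ones((2, 2))
upsampled = transpose_conv2d(small, np.ones((3, 3)), stride=2)
```

Note how the spatial size grows ((n-1)*stride + kernel), the inverse of how a strided convolution shrinks it; a trained reconstruction layer learns the kernel weights instead of using ones.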
The CNN provided according to the embodiments of the present application needs to be trained in advance to perfect its network parameters. Such a training process requires a data set of a certain size to support it. Such a data set may be referred to as a training set. The training set includes a plurality of sets of training image pairs, each image pair including a noisy training image and a noise-free true-value (Ground Truth) image. Such training image pairs resemble a test paper and its answer key, while the CNN resembles a test taker. Through repeated iterative training, the CNN adapts to this "examination system" so that a noisy training image can be reasonably reconstructed, removing noise to approach the level of the noise-free true-value image.
Through the technical scheme provided by the application, the CNN can be trained to automatically perfect its network parameters. Therefore, as long as a training set with a sufficiently large data size and sufficiently rich shooting scenes is provided, the trained CNN can efficiently complete the task of image noise reduction.
Fig. 2 is a schematic diagram illustrating a CNN2000 for image noise reduction according to an embodiment of the present application.
The CNN 2000 takes the initial image 2100 as its initial input to perform image noise reduction processing. The size of the initial image 2100 need not be limited. For example, the initial image 2100 may have an arbitrary resolution and aspect ratio. The initial image 2100 may be an RGB image with three color channels of red, green, and blue. The image of each color channel is represented by pixel values at the respective pixel points. These pixel values may lie in the value range [0, 255].
According to one embodiment of the present application, the initial image 2100 may be pre-cropped to fit a particular aspect ratio. For example, the initial image 2100 may be cropped to a size of 32 pixels by 32 pixels to match the CIFAR-10 dataset. Alternatively, the initial image 2100 may be cropped to a size of 227 pixels by 227 pixels to match the ImageNet dataset. Still alternatively, the initial image 2100 may be cropped to a size of 224 pixels by 224 pixels to match the input size of the VGG16 and ResNet models. The cropping may be manual; for example, a large number of online workers may use the Amazon Mechanical Turk (AMT) service to crop the image to fit a particular aspect ratio while preserving the subject. Alternatively, the cropping may be performed automatically by an ROI (Region of Interest) extraction layer. For example, the ROI extraction layer may automatically generate a bounding box that frames the target object, then automatically resize and crop the image to fit a particular aspect ratio based on this bounding box. The network parameters of the ROI extraction layer can be trained and optimized during the training process.
Optionally, the initial image 2100 may be normalized to convert its pixel values into a particular range. For example, the pixel values of each color channel of the initial image 2100 may be normalized to the range [0, 1] to facilitate subsequent processing. However, as will be appreciated by those skilled in the art, since the pixel values generally already lie within the fixed range [0, 255], the normalization process is not required but is merely an optional step.
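The optional cropping and normalization described above can be sketched as follows. The helper names center_crop and normalize are hypothetical, and the 224-pixel crop size follows the VGG/ResNet example:

```python
import numpy as np

def center_crop(image, size):
    """Crop the central size x size window; assumes the image is at least that large."""
    h, w = image.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return image[top:top+size, left:left+size]

def normalize(image):
    """Map 8-bit pixel values in [0, 255] to floats in [0, 1]."""
    return image.astype(np.float32) / 255.0

# A synthetic 300x400 RGB image with uint8 pixel values
img = (np.arange(300 * 400 * 3) % 256).astype(np.uint8).reshape(300, 400, 3)
patch = normalize(center_crop(img, 224))
```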
As shown in fig. 2, the CNN 2000 may include a convolutional layer 2200. One convolutional layer is shown as an example. However, as will be appreciated by those skilled in the art, to enhance the characterization capability of the features, multiple convolutional layers may be included in the CNN 2000. Each convolutional layer may include a plurality of convolution kernels, each composed of weights (Weight) and biases (Bias). The number of convolution kernels is also referred to as the number of feature channels. Each convolution kernel is sensitive only to certain features of the input layer, and these features can be extracted by the convolution operation. Thus, the convolutional layer 2200 may also be referred to as a feature extraction layer. The image denoising method 1000 according to an embodiment of the present application may perform a convolution operation on the initial image 2100 containing noise through the feature extraction layer (e.g., convolutional layer 2200) of the CNN 2000 to extract a primary feature map.
Generally, the size of a convolution kernel is smaller than the size of the input layer; therefore, each convolution kernel perceives only a partial region of the input layer, which is called its receptive field (Receptive Field). Each convolution kernel then slides across the entire input layer with a particular step size (Stride) until all of the information of the input layer is extracted. In this process, through weight sharing, a convolution kernel applies its weights and biases to feature extraction over the whole input layer, greatly reducing the computation load. However, weight sharing is not applicable to every scenario. For some images, the user's region of interest is concentrated in a certain region of the image (e.g., the center region), and the image characteristics of this region differ significantly from those of other regions. In such a scenario, feature extraction may be performed on a specific region of the image through a locally connected layer, whose convolution kernel weights are not shared with feature extraction on other image regions.
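The sliding-window arithmetic above yields the standard output-size formula for a strided convolution, sketched here (the AlexNet-style 227/11/4 numbers are purely illustrative):

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial size of a convolution output: floor((W - K + 2P) / S) + 1."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

# e.g. a 227-pixel input with an 11x11 kernel and stride 4 gives a 55-pixel output
size = conv_output_size(227, 11, stride=4)
```

With a 3x3 kernel, stride 1, and padding 1, the output size equals the input size, which is why that configuration is common for feature extraction layers that must preserve resolution.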
The CNN 2000 may also include an activation layer 2300. As described above, the convolution kernel only linearly transforms the initial image, and linear transformations alone are insufficient to characterize the semantics of image features. To enhance the semantic characterization capability of image features, a nonlinear activation layer is typically added. Such a nonlinear activation layer 2300 may perform a nonlinear transformation on the primary feature map to obtain a secondary feature map with stronger semantic representation capability. Different activation functions can be configured for the activation layer 2300 according to actual needs. For example, a sigmoid function may be employed to activate features. The middle part of the sigmoid curve, where the slope is large, corresponds to the sensitive region of a neuron, while the gently sloping parts on both sides correspond to the suppressed region. Its output value is limited to the range [0, 1]. Alternatively, the features may be activated using a tanh function. The tanh curve is similar to that of the sigmoid function, but the output value of the tanh function is limited to the range [-1, 1], and the function output is centered at 0.
According to an embodiment of the present application, the activation layer 2300 may include a Rectified Linear Unit (ReLU). Compared with the sigmoid and tanh functions, the ReLU function has no gradient-vanishing problem when the input value is positive. In addition, since the ReLU function is piecewise linear, its gradient computation is much faster and less computationally expensive than that of the sigmoid and tanh functions, in both forward propagation and backward propagation.
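The three activation functions discussed can be compared with a short sketch; the output ranges asserted below match those stated in the text:

```python
import numpy as np

def sigmoid(x):
    """S-shaped curve; output limited to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Similar curve to sigmoid, but output in (-1, 1) and centered at 0."""
    return np.tanh(x)

def relu(x):
    """Rectified Linear Unit: zero for negative inputs, identity for positive."""
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 3.0])
s, t, r = sigmoid(x), tanh(x), relu(x)
```

For the positive input 3.0, the ReLU output equals the input and its gradient is exactly 1, which is the property behind the "no gradient vanishing for positive inputs" remark above.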
According to an embodiment of the present application, the CNN 2000 may further include a Pooling layer 2400. The pooling layer 2400 may downsample its input layer to reduce the data size. For example, the pooling layer 2400 can downsample the feature map output by the activation layer 2300. This downsampling operation reduces the output size, on the one hand speeding up subsequent processing and on the other hand reducing over-fitting. According to one embodiment of the present application, the number of feature channels may be doubled during each downsampling.
The pooling layer 2400 may employ a variety of pooling operations. According to an embodiment of the present application, the pooling layer 2400 may employ Average Pooling. In the average pooling process, each pool covers N pixel values of its input layer, and the output value of each pool is the average of those N pixel values. In this way, the data size is reduced to 1/N of the original size.
Alternatively, according to an embodiment of the present application, the pooling layer 2400 may employ Max Pooling. In the max pooling process, each pool covers N pixel values of its input layer, and the output value of each pool is the maximum of those N pixel values. In this way, the data size is likewise reduced to 1/N of the original size.
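Both pooling modes can be sketched with non-overlapping 2x2 windows (N = 4), reducing the data to 1/4 of its original size; the helper name pool2d is illustrative:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping size x size pooling; shrinks each spatial dim by `size`."""
    h, w = x.shape[0] // size, x.shape[1] // size
    # Group the array into (h, size, w, size) blocks, then reduce each block
    blocks = x[:h*size, :w*size].reshape(h, size, w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

fmap = np.array([[1.0, 2.0, 5.0, 6.0],
                 [3.0, 4.0, 7.0, 8.0],
                 [0.0, 0.0, 1.0, 1.0],
                 [0.0, 4.0, 1.0, 1.0]])
max_out = pool2d(fmap, 2, "max")   # keeps the strongest activation per window
avg_out = pool2d(fmap, 2, "mean")  # keeps the mean activation per window
```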
The CNN 2000 may include a fully-connected layer 2500. In the fully-connected layer 2500, each neuron is connected to all neurons in the preceding layer. The fully-connected layer 2500 can aggregate and synthesize the features extracted by the preceding convolutional layers to obtain a feature map representing global features.
The CNN 2000 may also include a reconstruction layer 2600 to reconstruct a noise-reduced enhanced image 2700. The reconstruction layer 2600 may include specific network layers such as deconvolution layers, activation layers, and fully-connected layers. The deconvolution layer may perform the inverse of the operation of the convolutional layer 2200 to obtain a reconstructed feature map or image. The activation layer included in the reconstruction layer 2600 may have a configuration similar to that of the activation layer 2300 to provide a nonlinear transformation for image reconstruction. The fully-connected layer included in the reconstruction layer 2600 may have a configuration similar to that of the fully-connected layer 2500 to restore the global features of the image.
In summary, the CNN 2000 is in effect configured as an encoder-decoder-like data processing structure. The convolutional layer 2200 through the fully-connected layer 2500 embody the function of an encoder, which encodes the initial image 2100 to extract the semantic features of the image. The reconstruction layer 2600, which includes specific network layers such as a deconvolution layer, an activation layer, and a fully-connected layer, embodies the function of a decoder, which performs image reconstruction based on the semantic features extracted by the encoder to obtain the noise-reduced enhanced image 2700.
Fig. 3 is a flow chart illustrating a CNN training method 3000 according to an embodiment of the present application. The CNN 2000 shown in fig. 2 passes through two phases during application: a Training Phase and a verification (Test) Phase. In the training phase, a training set is used to iteratively train the CNN 2000 to perfect its network parameters; in the verification phase, the trained CNN 2000 is used for image denoising. As described above, such a training process requires a training set of a certain size to support it. The training set may include a plurality of sets of training image pairs, each image pair including a noisy training image and a noise-free true-value image. The training image pairs should be sufficient in number; for example, a training set may contain four or more sets of training image pairs. In addition, there should be sufficient diversity among the training image pairs to mimic real-life shooting scenarios. For example, training image pairs obtained by photographing different subjects, such as people and buildings, should be included.
In step S3010, the subject is photographed in an environment with sufficient illumination to obtain a noise-free true-value image. For example, a particular subject may be photographed in daylight or in a well-lit environment to obtain a noise-free true-value image. Alternatively, a noise-free true-value image may be obtained by using a tripod to increase the exposure time in a stable capture environment.
In step S3020, the subject is photographed in an environment with insufficient lighting to obtain a noisy training image. For example, the particular subject may be photographed under dim lighting to obtain a noisy training image. Images taken of the same subject in different lighting environments should have approximately the same content. For example, the capture positions of the noise-free true-value image and the noisy training image should be approximately the same, without significant deviation.
In step S3030, the noisy training image is used as the initial image to obtain a reconstructed image through CNN reconstruction. This step is called the forward propagation step. In step S3030, the initial image (e.g., initial image 2100) is fed into the CNN (e.g., CNN 2000) and processed in the forward direction to reconstruct the image. In this process, the CNN may be configured with randomly generated initial network parameters. Thus, the reconstructed image may still contain noise, or even more noise than the original image.
In step S3040, the reconstructed image is compared with the noise-free true-value image to obtain a training error. Various loss functions (or cost functions) may be used to obtain the training error. To prevent over-fitting during training, a regularization term may be added to the loss function to constrain the network parameters. According to one embodiment of the present application, the parameters of the CNN are regularized using the L1 norm during the training process to ensure the generalization capability of the CNN.
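A loss of the kind described, a reconstruction error term plus an L1 penalty on the network parameters, can be sketched as follows. The mean-squared-error data term and the regularization weight lam are illustrative choices, not fixed by the disclosure:

```python
import numpy as np

def l1_regularized_loss(reconstructed, ground_truth, weights, lam=0.01):
    """Mean squared reconstruction error plus an L1 penalty on the weights."""
    mse = np.mean((reconstructed - ground_truth) ** 2)
    l1 = lam * np.sum(np.abs(weights))  # drives small weights toward zero
    return mse + l1

recon = np.array([0.5, 0.2, 0.9])   # pixels of the reconstructed image
truth = np.array([0.4, 0.2, 1.0])   # pixels of the noise-free true-value image
w = np.array([0.3, -0.5])           # a toy stand-in for the network parameters
loss = l1_regularized_loss(recon, truth, w, lam=0.1)
```

Because the L1 penalty grows with the absolute size of every weight, minimizing this loss favors sparse parameters, which is the mechanism behind the generalization claim above.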
In step S3050, the training error is iteratively back-propagated through the CNN to correct the parameters of the CNN until the training error satisfies a convergence condition. For example, the network parameters may be iteratively optimized using SGD (Stochastic Gradient Descent) or BGD (Batch Gradient Descent) to minimize the loss function in the back-propagation process. The training process iterates many times until the training error converges. That is, steps S3010-S3050 run iteratively until a predetermined convergence condition is satisfied. The convergence condition may be, for example: the training error is less than a certain threshold; the training error falls within a certain tolerance range; or the training process has iterated a predetermined number of times.
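The update-until-convergence loop of step S3050 can be sketched on a toy one-parameter problem. The quadratic objective, learning rate, and threshold below are purely illustrative; a real CNN would update millions of parameters via back-propagated gradients:

```python
def sgd_step(w, grad, lr=0.1):
    """One gradient-descent parameter update: w <- w - lr * grad."""
    return w - lr * grad

# Minimize a toy "training error" L(w) = (w - 3)^2 with gradient 2*(w - 3),
# stopping once the error drops below a threshold (the convergence condition).
w = 0.0
for _ in range(1000):
    grad = 2.0 * (w - 3.0)
    w = sgd_step(w, grad, lr=0.1)
    if (w - 3.0) ** 2 < 1e-8:
        break
```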
In the training process, a certain strategy can be adopted to verify the generalization ability of the CNN. For example, the training set may be randomly divided into a first training set and a second training set. The first training set and the second training set may have the same number of pairs of training images. The training method 3000 described above is then performed using the training image pairs of the first training set to obtain a trained CNN. Finally, it is checked whether the trained CNN can be generalized to a training image pair of the second training set. For example, noisy training images of training image pairs of the second training set are input into the CNN to check whether the difference between the reconstructed enhanced image and the noise-free true image is within a threshold range. If the difference between the enhanced image and the noise-free true image is still greater than the threshold, it indicates that the trained CNN may not be able to generalize enough due to over-fitting, etc. At this point, the first training set and the second training set may be re-partitioned and the training method 3000 described above may be repeated. In addition, in some application scenarios, the initially input image may be oversized to create a large computational burden. In this case, the image to be processed may be divided into a plurality of sub-images, the plurality of sub-images are respectively input to the CNN, and then the plurality of processed noise reduction sub-images are stitched together.
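The sub-image splitting and stitching strategy mentioned above can be sketched as follows. The tile size and helper names are illustrative, and the per-tile CNN denoising call is omitted:

```python
import numpy as np

def split_into_tiles(image, tile):
    """Split an image whose sides are multiples of `tile` into tile x tile sub-images."""
    h, w = image.shape
    return [image[i:i+tile, j:j+tile]
            for i in range(0, h, tile) for j in range(0, w, tile)]

def stitch_tiles(tiles, image_shape, tile):
    """Reassemble sub-images (in row-major order) back into the full image."""
    h, w = image_shape
    out = np.zeros(image_shape)
    idx = 0
    for i in range(0, h, tile):
        for j in range(0, w, tile):
            out[i:i+tile, j:j+tile] = tiles[idx]
            idx += 1
    return out

img = np.arange(64.0).reshape(8, 8)
tiles = split_into_tiles(img, 4)
# In practice each tile would be passed through the CNN here before stitching.
restored = stitch_tiles(tiles, img.shape, 4)
```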
Fig. 4 is a schematic diagram showing an image noise reduction apparatus 4000 according to an embodiment of the present application. According to the present application, the image noise reduction apparatus 4000 may include: a feature extractor 4100 that performs a convolution operation on the initial image containing noise through a feature extraction layer of the convolutional neural network to extract a primary feature map; a feature activator 4200 that performs a nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; an image reconstructor 4300 that performs a deconvolution operation on the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image; and a trainer 4400 that trains the convolutional neural network in advance by utilizing a plurality of groups of matched noisy training images and noise-free true-value images.
According to an embodiment of the present application, the feature extractor 4100 may include: a cropper that crops the initial image to conform to a particular aspect ratio; and/or a normalizer that normalizes the initial image to convert pixel values of the initial image into a particular range of values.
According to one embodiment of the application, the activation layer may comprise a modified linear element.
According to one embodiment of the present application, the feature activator 4200 may include: the nonlinear activator carries out nonlinear transformation on the primary characteristic diagram through the activation layer to obtain an activation characteristic diagram; and a pooling device down-sampling the activation signature through a pooling layer of the convolutional neural network to obtain the secondary signature.
According to one embodiment of the present application, the pooling layers may include a maximum pooling layer or an average pooling layer.
According to one embodiment of the present application, the trainer 4400 may comprise: a training set generator that photographs a subject in a well-lit environment to obtain the noise-free true-value image, and photographs the subject in a poorly-lit environment to obtain the noisy training image; a forward propagator using the noisy training image as the initial image to obtain the reconstructed image using the convolutional neural network reconstruction; a comparator to compare the reconstructed image with the noise-free true-value image to obtain a training error; and a back propagator iteratively back-propagating the training error through the convolutional neural network to modify parameters of the convolutional neural network until the training error satisfies a convergence condition.
According to an embodiment of the present application, the image noise reduction apparatus 4000 may further include: a regularizer that regularizes the parameters of the convolutional neural network using an L1 norm during training.
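L1 regularization adds the summed absolute values of the parameters to the training error. A minimal sketch, assuming a regularization weight `lam` that the application does not specify:

```python
import torch
import torch.nn as nn

def l1_penalty(model, lam=1e-4):
    """L1-norm regularization term summed over all trainable parameters.

    `lam` is an assumed regularization weight, not taken from the application.
    """
    return lam * sum(p.abs().sum() for p in model.parameters())

model = nn.Conv2d(1, 1, kernel_size=1)           # toy stand-in network
x = torch.randn(1, 1, 8, 8)
data_loss = nn.MSELoss()(model(x), torch.zeros_like(x))
total_loss = data_loss + l1_penalty(model)       # regularized training error
```

Because the penalty is non-negative, the regularized loss is never smaller than the data term; back-propagating it drives small weights toward zero.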
The present application also provides a computer system, which may be a mobile terminal, a personal computer (PC), a tablet computer, a server, or the like. Referring now to Fig. 5, there is shown a schematic block diagram of a computer system suitable for implementing the terminal device or server of the present application. As shown in Fig. 5, the computer system includes one or more processors, a communication part, and the like, for example: one or more central processing units (CPUs) 501, and/or one or more graphics processing units (GPUs) 513, etc., which may perform various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 502 or loaded from a storage section 508 into a random access memory (RAM) 503. The communication part 512 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card.
The processor may communicate with the read-only memory 502 and/or the random access memory 503 to execute the executable instructions, connect to the communication part 512 through the bus 504, and communicate with other target devices through the communication part 512, thereby completing the operations corresponding to any of the methods provided by the embodiments of the present application, for example: performing a convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map; performing a nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and performing a deconvolution operation on the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image, wherein the convolutional neural network is trained in advance using a plurality of sets of matched noisy training images and noise-free ground-truth images.
In addition, the RAM 503 may also store various programs and data necessary for the operation of the apparatus. The CPU 501, the ROM 502, and the RAM 503 are connected to one another via the bus 504. Where the RAM 503 is present, the ROM 502 is an optional module. The RAM 503 stores executable instructions, or writes executable instructions into the ROM 502 at runtime, and the executable instructions cause the processor 501 to perform the operations corresponding to the above-described method. An input/output (I/O) interface 505 is also connected to the bus 504. The communication part 512 may be integrated, or may be provided with a plurality of sub-modules (e.g., a plurality of IB network cards) connected to the bus link.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read therefrom can be installed into the storage section 508 as needed.
It should be noted that the architecture shown in Fig. 5 is only one optional implementation; in practice, the number and types of the components in Fig. 5 may be selected, removed, added, or replaced according to actual needs. Different functional components may also be arranged separately or integrated: for example, the GPU and the CPU may be arranged separately, or the GPU may be integrated on the CPU; the communication part may be arranged separately, or may be integrated on the CPU or the GPU; and so on. These alternative embodiments all fall within the scope of the present disclosure.
Further, according to an embodiment of the present application, the processes described above with reference to the flowcharts may be implemented as a computer software program. For example, the present application provides a non-transitory machine-readable storage medium having stored thereon machine-readable instructions executable by a processor to perform the method steps provided herein, such as: performing a convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map; performing a nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and performing a deconvolution operation on the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image, wherein the convolutional neural network is trained in advance using a plurality of sets of matched noisy training images and noise-free ground-truth images. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. When executed by the central processing unit (CPU) 501, the computer program performs the above-described functions defined in the methods of the present application.
The methods, apparatuses, and devices of the present application may be implemented in many ways, for example by software, hardware, firmware, or any combination thereof. The above-described order of the method steps is for illustration only; the steps of the methods of the present application are not limited to the order specifically described above unless otherwise stated. Further, in some embodiments, the present application may also be embodied as a program recorded in a recording medium, the program including machine-readable instructions for implementing a method according to the present application. Thus, the present application also covers a recording medium storing a program for executing a method according to the present application.
The description of the present application has been presented for purposes of illustration and description, and is not intended to be exhaustive or to limit the application to the form disclosed. Many modifications and variations will be apparent to practitioners skilled in the art. The embodiments were chosen and described in order to best explain the principles of the application and its practical application, and to enable others of ordinary skill in the art to understand the application in various embodiments and with various modifications as are suited to the particular use contemplated.

Claims (14)

1. An image noise reduction method, comprising:
performing convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map;
performing nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and
deconvolving the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image,
the convolutional neural network is trained by utilizing a plurality of groups of matched noisy training images and noiseless truth-value images in advance.
2. The method of claim 1, wherein convolving the initial image containing noise with a feature extraction layer of a convolutional neural network to extract a primary feature map comprises:
cropping the initial image to fit a particular aspect ratio; and/or
The initial image is normalized to convert pixel values of the initial image into a particular range of values.
3. The method of claim 1, wherein non-linearly transforming the primary feature map by an activation layer of the convolutional neural network to obtain a secondary feature map comprises:
carrying out nonlinear transformation on the primary feature map through the activation layer to obtain an activation feature map; and
downsampling the activation feature map through a pooling layer of the convolutional neural network to obtain the secondary feature map.
4. The method of claim 1, wherein the training comprises:
photographing a subject in an environment with sufficient illumination to obtain the noise-free ground-truth image;
photographing the subject in an environment with insufficient illumination to obtain the noisy training image;
using the noisy training image as the initial image to obtain a reconstructed image through reconstruction by the convolutional neural network;
comparing the reconstructed image with the noise-free ground-truth image to obtain a training error; and
iteratively back-propagating the training error through the convolutional neural network to modify parameters of the convolutional neural network until the training error satisfies a convergence condition.
5. The method of claim 4, wherein the training regularizes the parameters of the convolutional neural network using an L1 norm.
6. An image noise reduction apparatus equipped with a convolutional neural network, comprising:
a feature extractor for performing a convolution operation on an initial image containing noise through a feature extraction layer of the convolutional neural network to extract a primary feature map;
the feature activator carries out nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map;
the image reconstructor is used for carrying out deconvolution operation on the secondary feature map through a reconstruction layer of the convolutional neural network so as to obtain an enhanced image after noise reduction; and
and the trainer is used for training the convolutional neural network by utilizing a plurality of groups of matched noisy training images and noiseless true value images in advance.
7. The apparatus of claim 6, wherein the feature extractor comprises:
a cropper that crops the initial image to conform to a particular aspect ratio; and/or
A normalizer to normalize the initial image to convert pixel values of the initial image into a particular range of values.
8. The apparatus of claim 6, wherein the activation layer comprises a rectified linear unit (ReLU).
9. The apparatus of claim 6, wherein the feature activator comprises:
the nonlinear activator carries out nonlinear transformation on the primary characteristic diagram through the activation layer to obtain an activation characteristic diagram; and
a pooling device that downsamples the activation feature map through a pooling layer of the convolutional neural network to obtain the secondary feature map.
10. The apparatus of claim 9, wherein the pooling layer comprises a maximum pooling layer or an average pooling layer.
11. The apparatus of claim 6, wherein the trainer comprises:
a training set generator that photographs a subject in a well-lit environment to obtain the noise-free ground-truth image, and photographs the same subject in a poorly-lit environment to obtain the noisy training image;
a forward propagator that uses the noisy training image as the initial image to obtain a reconstructed image reconstructed by the convolutional neural network;
a comparator that compares the reconstructed image with the noise-free ground-truth image to obtain a training error; and
a back propagator that iteratively back-propagates the training error through the convolutional neural network to modify parameters of the convolutional neural network until the training error satisfies a convergence condition.
12. The apparatus of claim 11, further comprising:
a regularizer regularizing parameters of the convolutional neural network using an L1 norm during the training.
13. An image noise reduction system, the system comprising:
a processor; and
a memory coupled to the processor and storing machine-readable instructions executable by the processor to:
performing convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map;
performing nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and
deconvolving the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image,
the convolutional neural network is trained by utilizing a plurality of groups of matched noisy training images and noiseless truth-value images in advance.
14. A non-transitory machine-readable storage medium storing machine-readable instructions executable by a processor to:
performing convolution operation on an initial image containing noise through a feature extraction layer of a convolutional neural network to extract a primary feature map;
performing nonlinear transformation on the primary feature map through an activation layer of the convolutional neural network to obtain a secondary feature map; and
deconvolving the secondary feature map through a reconstruction layer of the convolutional neural network to obtain a noise-reduced enhanced image,
the convolutional neural network is trained by utilizing a plurality of groups of matched noisy training images and noiseless truth-value images in advance.
CN201810777110.7A 2018-07-16 2018-07-16 Image noise reduction method, device, system and storage medium Pending CN110728627A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810777110.7A CN110728627A (en) 2018-07-16 2018-07-16 Image noise reduction method, device, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810777110.7A CN110728627A (en) 2018-07-16 2018-07-16 Image noise reduction method, device, system and storage medium

Publications (1)

Publication Number Publication Date
CN110728627A true CN110728627A (en) 2020-01-24

Family

ID=69217244

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810777110.7A Pending CN110728627A (en) 2018-07-16 2018-07-16 Image noise reduction method, device, system and storage medium

Country Status (1)

Country Link
CN (1) CN110728627A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583142A (en) * 2020-04-30 2020-08-25 深圳市商汤智能传感科技有限公司 Image noise reduction method and device, electronic equipment and storage medium
CN112465776A (en) * 2020-11-26 2021-03-09 常州信息职业技术学院 Intelligent crack detection method based on fuzzy image on surface of wind turbine
WO2022221982A1 (en) * 2021-04-19 2022-10-27 中国科学院深圳先进技术研究院 Image reconstruction method and apparatus, terminal device, and storage medium
US11540798B2 (en) 2019-08-30 2023-01-03 The Research Foundation For The State University Of New York Dilated convolutional neural network system and method for positron emission tomography (PET) image denoising

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8233586B1 (en) * 2011-02-17 2012-07-31 Franz Edward Boas Iterative reduction of artifacts in computed tomography images using forward projection and an edge-preserving blur filter
CN106204468A (en) * 2016-06-27 2016-12-07 深圳市未来媒体技术研究院 A kind of image de-noising method based on ReLU convolutional neural networks
CN107330405A (en) * 2017-06-30 2017-11-07 上海海事大学 Remote sensing images Aircraft Target Recognition based on convolutional neural networks
CN107516304A (en) * 2017-09-07 2017-12-26 广东工业大学 A kind of image de-noising method and device
CN107689034A (en) * 2017-08-16 2018-02-13 清华-伯克利深圳学院筹备办公室 A kind of training method of neutral net, denoising method and device
US20180061059A1 (en) * 2016-08-26 2018-03-01 Elekta, Inc. System and methods for image segmentation using convolutional neural network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8233586B1 (en) * 2011-02-17 2012-07-31 Franz Edward Boas Iterative reduction of artifacts in computed tomography images using forward projection and an edge-preserving blur filter
CN106204468A (en) * 2016-06-27 2016-12-07 深圳市未来媒体技术研究院 A kind of image de-noising method based on ReLU convolutional neural networks
US20180061059A1 (en) * 2016-08-26 2018-03-01 Elekta, Inc. System and methods for image segmentation using convolutional neural network
CN107330405A (en) * 2017-06-30 2017-11-07 上海海事大学 Remote sensing images Aircraft Target Recognition based on convolutional neural networks
CN107689034A (en) * 2017-08-16 2018-02-13 清华-伯克利深圳学院筹备办公室 A kind of training method of neutral net, denoising method and device
CN107516304A (en) * 2017-09-07 2017-12-26 广东工业大学 A kind of image de-noising method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Chuanpeng; Qin Pinle; Zhang Jinjing: "Research on image denoising based on deep convolutional neural networks" (in Chinese) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11540798B2 (en) 2019-08-30 2023-01-03 The Research Foundation For The State University Of New York Dilated convolutional neural network system and method for positron emission tomography (PET) image denoising
CN111583142A (en) * 2020-04-30 2020-08-25 深圳市商汤智能传感科技有限公司 Image noise reduction method and device, electronic equipment and storage medium
CN111583142B (en) * 2020-04-30 2023-11-28 深圳市商汤智能传感科技有限公司 Image noise reduction method and device, electronic equipment and storage medium
CN112465776A (en) * 2020-11-26 2021-03-09 常州信息职业技术学院 Intelligent crack detection method based on fuzzy image on surface of wind turbine
CN112465776B (en) * 2020-11-26 2023-10-31 常州信息职业技术学院 Crack intelligent detection method based on wind turbine surface blurred image
WO2022221982A1 (en) * 2021-04-19 2022-10-27 中国科学院深圳先进技术研究院 Image reconstruction method and apparatus, terminal device, and storage medium

Similar Documents

Publication Publication Date Title
US11107205B2 (en) Techniques for convolutional neural network-based multi-exposure fusion of multiple image frames and for deblurring multiple image frames
EP3948764B1 (en) Method and apparatus for training neural network model for enhancing image detail
Yang et al. Image correction via deep reciprocating HDR transformation
US11037278B2 (en) Systems and methods for transforming raw sensor data captured in low-light conditions to well-exposed images using neural network architectures
US10032256B1 (en) System and method for image processing using automatically estimated tuning parameters
Sutour et al. Adaptive regularization of the NL-means: Application to image and video denoising
CN110728627A (en) Image noise reduction method, device, system and storage medium
US20240062530A1 (en) Deep perceptual image enhancement
Liu et al. Survey of natural image enhancement techniques: Classification, evaluation, challenges, and perspectives
CN110728626A (en) Image deblurring method and apparatus and training thereof
US11741579B2 (en) Methods and systems for deblurring blurry images
WO2022134971A1 (en) Noise reduction model training method and related apparatus
Li et al. Hdrnet: Single-image-based hdr reconstruction using channel attention cnn
KR102095443B1 (en) Method and Apparatus for Enhancing Image using Structural Tensor Based on Deep Learning
Robidoux et al. End-to-end high dynamic range camera pipeline optimization
KR102262671B1 (en) Method and storage medium for applying bokeh effect to video images
Rasheed et al. LSR: Lightening super-resolution deep network for low-light image enhancement
Zhou et al. Sparse representation with enhanced nonlocal self-similarity for image denoising
Zheng et al. Windowing decomposition convolutional neural network for image enhancement
CN116596799A (en) Low-illumination image enhancement method based on channel space compound attention
Batard et al. Variational models for color image correction inspired by visual perception and neuroscience
Jayanthi et al. Underwater haze removal using contrast boosted grayscale image
Shamir et al. A study on brightness enhancement of old images by estimating and combining curve maps and attention-guided illumination maps
Jia et al. Detachable image decomposition and illumination mapping search for low-light image enhancement
US11941862B2 (en) Apparatus, method, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200124
